[Remote] Senior AI Agent & Evaluations Engineer

Work from home · Full-time role · Hiring

Note: This is a remote role open to candidates in the USA. Vacatia is building the future of vacation ownership, with a focus on transforming the industry through AI. The company is seeking a Senior AI Agent & Evaluations Engineer to design and improve AI agents that directly affect customer experience and operational efficiency, and to own the intelligence layer behind these systems.

Responsibilities

  • Design, refine, and optimize prompts, tool definitions, routing logic, and decision-making behavior across Vacatia's AI agent ecosystem
  • Build and maintain evaluation frameworks, golden datasets, grading systems, and regression testing pipelines that measure agent quality and reliability
  • Develop guardrails and safe-failure mechanisms that ensure agents operate responsibly in customer-facing and financially sensitive workflows
  • Monitor production performance, investigate failures, identify edge cases, and continuously improve agent outcomes through data-driven iteration
  • Partner with business stakeholders to translate policies, operational requirements, and domain expertise into measurable agent behavior
  • Collaborate with engineering teams to define context requirements, tool contracts, and integration specifications that support agent success
  • Create scalable frameworks and reusable patterns for deploying AI agents across new business workflows and use cases
  • Establish best practices for prompt engineering, evaluation methodologies, observability, and agent operations

Skills

  • Proven experience shipping and owning production AI agents or LLM-powered systems beyond proof-of-concept environments
  • Deep expertise in prompt engineering, including system prompts, tool usage, context management, output constraints, and agent behavior design
  • Hands-on experience building evaluation frameworks using golden datasets, scoring rubrics, LLM-as-judge methodologies, and regression testing
  • Strong familiarity with modern AI development tools such as Claude Code, Codex, or similar coding agents
  • Experience with agent observability and evaluation platforms such as LangSmith, Langfuse, Arize, Galileo, or comparable solutions
  • Ability to distinguish prompt issues from data, tooling, model, or evaluation failures and systematically improve agent performance
  • Strong written and verbal communication skills with the ability to work effectively across engineering and business teams
  • Demonstrated ownership mindset with a passion for building reliable, measurable, and continuously improving AI systems
  • Experience building agents that process communication-based workflows including emails, support tickets, chat interactions, or transcripts
  • Experience with multiple agent frameworks and a practical understanding of their tradeoffs
  • Familiarity with the evolving LLM landscape and model selection strategies
  • Experience designing and implementing end-to-end evaluation pipelines and agent operations workflows
  • Production experience with online evaluation systems and automated scoring of live traffic
  • Experience integrating AI systems with Salesforce, AWS Connect, or customer engagement platforms
  • Background in customer-facing industries where accuracy, compliance, and communication quality are critical
  • Contributions to open-source projects, technical writing, or public thought leadership in AI, prompt engineering, or agent development

Company Overview

  • Vacatia is the resort marketplace for vacationing families, whose mission is to make family vacations better. It was founded in 2013 and is headquartered in Mill Valley, California, USA, with a workforce of 1001-5000 employees. Its website is https://vacatia.com.

Company H1B Sponsorship

  • Vacatia has a track record of offering H1B sponsorships, with 2 in 2025 and 1 in 2022. Please note that this does not guarantee sponsorship for this specific role.