[Remote] Senior Staff Machine Learning Engineer, Data & Eval

Work from home, full-time role

Note: This is a remote position open to candidates in the USA. Airbnb is a leading hospitality company that connects hosts and guests for unique stays and experiences. The company is seeking a Senior Staff Machine Learning Engineer to set technical direction and lead execution for the ML evaluation and data systems that power its customer support AI products.

Responsibilities

Define evaluation strategy and success metrics for GenAI systems, aligning offline evaluation with online business and customer experience outcomes
Build and scale evaluation frameworks (golden sets, synthetic data, automated regressions, rubric-based grading, LLM-as-judge where appropriate) with strong controls for bias, drift, and reliability
Design the data flywheel: instrumentation, feedback collection, data quality checks, labeling strategy, dataset versioning, and governance to support continuous improvement
Lead cross-functional quality initiatives across product, ops, and engineering, driving clarity on what “good” looks like and how teams act on evaluation results
Develop and productionize pipelines for dataset creation, model monitoring, evaluation-at-scale, and continuous testing (pre-deploy and post-deploy)
Drive technical decisions and architecture for evaluation and data infrastructure, balancing speed, rigor, cost, and safety

Skills

Educational Background: PhD in Computer Science, Mathematics, Statistics, or related technical field (or equivalent practical experience)
Industry Experience: 10+ years building, testing, and shipping ML/AI systems end-to-end, including 2+ years of experience with GenAI/LLM systems in production
Leadership Experience: 5+ years leading large, ambiguous technical initiatives as a senior IC, influencing roadmap and engineering/science direction across teams
Technical Proficiency: Deep expertise in evaluation methodology (offline/online alignment, metric design, human-in-the-loop evaluation, A/B testing, power analysis, regression testing)
Hands-on experience with GenAI systems, including orchestration, retrieval, tool calling, and memory
Experience building data pipelines and quality systems (labeling workflows, dataset curation, versioning, monitoring, and governance)
Solid ML fundamentals and best practices (model selection, training/serving, monitoring, reliability, and model lifecycle management)
Customer Support Systems: Experience applying ML/AI to customer support workflows (e.g., agent assist, classification/routing, resolution recommendation, QA)
Infrastructure & Quality at Scale: Experience building robust evaluation platforms for agent behavior validation, safety/guardrails, and continuous improvement
Agile Practice for Applied AI: Proven ability to take evaluation and data flywheel work from incubation to production, iterating quickly while maintaining scientific rigor

Benefits

Bonus
Equity
Benefits
Employee Travel Credits

Company Overview

Airbnb is an online community marketplace for people to list, discover, and book accommodations through mobile phones or the Internet. It was founded in 2008, and is headquartered in San Francisco, California, USA, with a workforce of 5001-10000 employees. Its website is https://www.airbnb.com.

Company H1B Sponsorship

Airbnb has a track record of offering H1B sponsorships, with 59 in 2026, 234 in 2025, 176 in 2024, 160 in 2023, 270 in 2022, 250 in 2021, and 274 in 2020. Please note that this does not guarantee sponsorship for this specific role.
