[Remote] Staff Machine Learning Engineer
Note: This is a remote position open to candidates in the USA. AppFolio, an innovative technology leader in the real estate industry, is seeking a Staff Machine Learning Engineer to advance its ML platform. The role involves designing and operating the ML infrastructure, optimizing AI costs, and collaborating with engineering teams to ensure AI systems are production-ready.
Responsibilities
- Design and operate AppFolio's ML infrastructure on AWS — ECS, SageMaker, GPU fleets, model serving, autoscaling, and cost controls
- Optimize cost across all AI applications — provider routing, caching, batch vs. real-time, model size selection, and inference economics
- Maintain reliable, multi-provider LLM access across Google, OpenAI, and Anthropic with sensible fallbacks and abstractions
- Build the training and fine-tuning stack for Small Language Models, including data pipelines, GPU orchestration, and evaluation
- Partner with Voice & Agents and Research ML engineers to harden their prototypes into production systems with SLOs, on-call rotations, and observability
- Operate AppFolio's AI safety and authorization layer — guardrails on AWS, scoped tool permissions, and human-in-the-loop gates for autonomous agent actions
Skills
- ML infra at scale: Has built and operated production ML infrastructure on AWS — ECS, SageMaker, GPUs, autoscaling, and cost controls
- Inference platforms: Production experience with model serving for both LLMs and custom models; understands quantization, batching, and routing
- Provider breadth: Direct experience integrating with Google (Vertex / Gemini), OpenAI, and Anthropic APIs in production
- Training capability: Has trained or fine-tuned language models end-to-end; comfortable with deep learning, evaluation, and inference
- Cloud-native engineering: Strong Python, Docker, dependency management, and CI/CD for AI workloads
- RAG & agents: Working knowledge of LangChain / LangGraph and modern RAG patterns over structured and unstructured data
- Cost optimization: Demonstrated experience reducing unit cost of AI workloads without regressing quality or latency
- AI safety & authorization: Hands-on experience operating AI guardrails, scoped tool permissions, and authorization layers for production AI systems
- Systems thinker: You think in terms of platforms and long-term leverage, not just features
- Production builder: You've built and scaled ML infrastructure in production with meaningful business impact
- Ambiguity: You operate effectively in high ambiguity, turning unclear infra problems into clear direction
- Owner-operator: You take ownership with a founder/owner-operator mindset, act with urgency, and focus on outcomes
- Pace: You have a strong desire to move fast and deliver impact, while maintaining sound engineering judgment
- Collaboration: You are humble, collaborative, and low-ego, and you elevate those around you
- Sustainability: You value work-life balance as a foundation for sustained high performance
- Reliability mindset: You treat ML infra like any other production system — SLOs, on-call, observability, postmortems
- Experience training Small Language Models for production use
- GPU performance tuning (vLLM, TensorRT, Triton, or similar)
- Prior Staff-level role at a company with a significant AI infra footprint
- Experience with ontology-driven systems or knowledge graphs supporting AI applications
- Contributions to open-source ML infrastructure or LLM tooling
Benefits
- Regular full-time employees are eligible for benefits ([see here](https://www.appfolio.com/company/careers#Benefits)).
- We enable a culture of high performance, where delivering results is recognized by opportunities for growth and compelling total rewards.
- We partner with you to realize your potential by investing in you from the start.
- We're cultivating a team of big thinkers through coaching and mentorship with our best-in-class leaders, and giving you the time and tools to develop your skills.
- We excel at hybrid work by fostering an environment that feels flexible, personal and connected, no matter where we are.
- We create space to fuel innovation and collaboration, and we come together to celebrate, connect, and succeed.