[Remote] Lead AI Architect and Engineer
Note: This is a remote position open to candidates in the USA. Leorna is hiring a Lead AI Agent Engineer / Architect to design and build the agentic systems at the heart of their product. The role involves architecting retrieval pipelines, designing multi-step agents, and building the workflow and automation infrastructure for LLM-powered features that work in production.
Responsibilities
- Architect and own end-to-end RAG pipelines: ingestion, chunking strategy, embeddings, vector storage, retrieval, reranking, and answer synthesis
- Design multi-step agents using tool use, function calling, and structured output — and decide when an agent is the right answer vs. a deterministic workflow
- Build internal automations and pipelines that connect LLMs to the rest of our systems (databases, APIs, queues, schedulers, third-party SaaS)
- Choose and integrate the right tools across the stack: LLM providers (Anthropic, OpenAI, open-weights), vector DBs (Pinecone, Weaviate, Qdrant, pgvector), orchestration frameworks (LangGraph, LlamaIndex, custom), eval harnesses (Braintrust, LangSmith, custom)
- Define and run evaluations: golden sets, regression suites, online and offline metrics. Treat eval as production code, not a notebook afterthought
- Instrument observability: tracing, prompt and response logs, cost and latency budgets, drift detection
- Handle the unsexy production concerns: rate limits, retries, backoff, idempotency, timeouts, caching, fallbacks across providers, prompt-injection defense, and PII handling
- Partner with engineering to expose agent capabilities as clean APIs that frontends and backends can consume
- Educate the team — set internal standards for prompt design, retrieval patterns, agent boundaries, and failure handling
Skills
- 10+ years of professional software engineering experience overall, with strong fundamentals in Python or TypeScript
- Hands-on experience designing and shipping production RAG pipelines: chunking strategy, embeddings, vector search, hybrid retrieval, reranking, and citation/grounding
- Experience building and operating agent systems in production: tool use / function calling, planning and reflection patterns (e.g., ReAct, plan-and-execute), structured output, and multi-step orchestration
- Deep practical knowledge of at least one major LLM provider API (Anthropic, OpenAI, etc.) and at least one orchestration framework (LangGraph, LlamaIndex, Haystack, or a justified custom stack)
- Hands-on experience with at least one vector store (Pinecone, Weaviate, Qdrant, Chroma, pgvector, etc.) and an opinion about when to use which
- Strong grasp of evaluation: building golden sets, automated grading (LLM-as-judge with sanity checks), regression testing, and online metrics
- Experience integrating LLM-powered features into real backend systems via REST/GraphQL/queues/webhooks — not just notebooks or chat UIs
- Demonstrated production use of AI coding tools (Claude Code, Cursor, Copilot) in your daily workflow
- Excellent written communication; can write a one-page design doc that an engineering team can ship from
- Located in the United States
- Experience building workflow/automation systems (Temporal, Inngest, Airflow, Prefect, n8n, custom) and an understanding of when each fits
- Experience with open-weights models (Llama, Mistral, Qwen) and self-hosted inference (vLLM, TGI, Ollama)
- Fine-tuning, distillation, or DPO experience — and a clear point of view on when it's worth the cost
- Background in classical ML, IR, or NLP that informs your retrieval and ranking choices
- Experience with prompt-injection defense, jailbreak red-teaming, and LLM safety patterns
- Open-source contributions to AI/agent frameworks
- Experience deploying agentic systems to enterprise customers with audit, compliance, and SOC2 considerations
- Familiarity with MCP (Model Context Protocol) and broader agent-tool standards
Benefits
- Meaningful equity as part of the founding team.
- Generous PTO and parental leave.
- Annual learning + AI tooling budget (model API credits, eval tools, conferences).
- Hardware of your choice.
- A small, senior team that ships.
Company Overview