[Remote] Senior Software Engineer, AI Platform
Note: This is a remote position open to candidates in the USA. Isolved is a leading provider of human capital management (HCM) solutions that combines modern technology with expert services and support. The Senior Software Engineer, AI Platform leads a team of engineers in designing and building scalable AI solutions, driving program execution and setting technical direction while addressing complex challenges.
Responsibilities
- Design and build a scalable LLM gateway with model routing, prompt management, cost attribution, rate limiting, and caching
- Develop and operate RAG pipelines, embedding services, and vector search infrastructure for platform-wide use
- Implement platform-level cost optimization strategies, including semantic caching and model selection by workload
- Build and maintain agentic runtime infrastructure, including orchestration, state management, and human-in-the-loop patterns
- Develop extensible MCP server and tool ecosystems for product team integration
- Design and support multi-agent coordination patterns using modern frameworks and protocols
- Establish comprehensive AI observability, including usage, latency, cost tracking, and distributed tracing
- Implement AI governance controls, including access management, audit logging, content filtering, and security protections
- Build AI incident detection and response capabilities, including monitoring for failures, hallucinations, and cost anomalies
- Create developer-friendly SDKs across languages (Python, .NET, TypeScript) to simplify platform adoption
- Define 'paved road' patterns for common AI use cases and support onboarding of product teams
- Build automated evaluation pipelines and continuously monitor production quality and model performance
Skills
- 5+ years of professional software engineering experience, with Python as your primary language
- 2+ years building production LLM-powered systems - inference, RAG, agentic patterns, or AI infrastructure
- Deep Python expertise - this is the primary language for AI platform work
- Working proficiency in C#/.NET - the platform serves teams that live in C#, so interop is real and matters
- Strong hands-on experience with agentic frameworks - Semantic Kernel, LangGraph, LangChain, or you've built your own
- Production experience with RAG architecture: chunking strategies, embedding models, vector search, retrieval quality, and the failure modes that don't show up in demos
- Azure AI Foundry / Azure OpenAI experience - model deployment, API integration, observability tooling
- Experience building internal platforms or SDKs that other engineers depend on - you understand what makes a platform feel good to use
- Strong grasp of AI observability: token usage, latency, cost tracking, and distributed tracing across multi-agent workflows
- Experience with TypeScript and building developer SDKs or tooling
- Hands-on experience with AI evaluation frameworks (LLM-as-judge, automated regression testing)
- Knowledge of AI governance practices, including access control, audit logging, and security safeguards
- Familiarity with container-based deployments (e.g., Azure Container Apps) and infrastructure-as-code (Terraform)
- Awareness of AI regulatory frameworks such as NIST AI RMF or ISO/IEC 42001