[Remote] Architect - Platform Engineering - USA
Note: This is a remote position open to candidates in the USA. Quantiphi is an award-winning, AI-first digital engineering and consulting company focused on delivering high-impact services and solutions that help organizations solve what truly matters. The role involves architecting and implementing MLOps strategies, designing enterprise-grade ML/LLM pipelines, and collaborating with cross-functional teams to deliver production-ready ML solutions on Google Cloud.
Responsibilities
- Architect and implement the MLOps strategy for the program, ensuring alignment with the project proposal and delivery roadmap
- Design and own enterprise-grade ML/LLM pipelines covering model training, validation, deployment, versioning, monitoring, and CI/CD automation using GCP-native services
- Build container-oriented ML platforms (GKE-first) while evaluating alternative orchestration tools with similar capabilities (Kubeflow, Vertex AI, MLflow, Airflow, etc.)
- Implement hybrid MLOps + LLMOps workflows, including prompt/version governance, evaluation frameworks, and monitoring for LLM-based systems within the GCP environment
- Serve as a technical authority across multiple internal and customer projects, contributing architectural patterns, best practices, and reusable frameworks for GCP
- Enable observability, monitoring, drift detection, lineage tracking, and auditability across ML/LLM systems using tools like Cloud Monitoring and Vertex AI Model Monitoring
- Collaborate with cross-functional teams — data engineering, platform, DevOps, and client stakeholders — to deliver production-ready ML solutions on Google Cloud
- Ensure all solutions adhere to security, governance, and compliance expectations, particularly around handling GCP services, Google Kubernetes Engine workloads, and MLOps tools
- Conduct architecture reviews, troubleshoot complex ML system issues, and guide teams through implementation across cloud-native ML platforms on GCP
- Mentor engineers and provide guidance on modern MLOps tools, Vertex AI platform capabilities, and best practices
- Travel required: up to 30%
Skills
- 10+ years working in ML/AI platform engineering or AI/MLOps roles with strong architecture exposure
- Strong expertise in the Google Cloud (GCP) native AI/ML stack, including: Vertex AI (primary), Google Kubernetes Engine (GKE), Cloud Functions, AutoML, Vertex AI Pipelines, BigQuery ML, API Gateway, and CI/CD (Cloud Build/Cloud Deploy or equivalent)
- Hands-on experience with, or working awareness of, MLOps tools such as MLflow, Kubeflow, Vertex AI Pipelines, Airflow, BentoML, KServe, and Seldon
- Deep understanding of model lifecycle management (feature engineering -> training -> registry -> deployment -> monitoring)
- Experience implementing or supporting LLMOps pipelines, including prompt versioning, evaluation metrics, and automation frameworks
- Deep understanding of the ML lifecycle: data ingestion, feature engineering, training, evaluation, model packaging, CI/CD, drift detection, monitoring, and governance
- Strong experience with Google Cloud's Vertex AI platform, including Pipelines, Feature Store, Model Registry, and Model Monitoring
- Experience implementing ML CI/CD pipelines including automated training, testing, validation, model promotion, and endpoint deployment
- Strong SQL and data transformation experience using Snowflake, Databricks, Spark
- Experience with feature engineering pipelines and Feature Store management
- Understanding of lineage tracking: training data snapshot, feature versions, code versioning, metadata tracking, and reproducibility
- Hands-on experience with Vertex AI Foundation Models, OpenAI, Anthropic, or Llama models
- Experience with Cloud Monitoring, Vertex AI Model Monitoring, Prometheus/Grafana
- Strong foundation in Python and cloud-native development patterns
- Solid understanding of security best practices, Cloud IAM, secrets management, and artifact governance
Benefits
- Be part of the fastest-growing AI-first digital transformation and engineering company in the world
- Be a leader of an energetic team of highly dynamic and talented individuals
- Exposure to working with Fortune 500 companies and innovative market disruptors
- Exposure to the latest technologies related to artificial intelligence and machine learning, data and cloud