[Remote] Principal AI / Machine Learning Data Engineer - Remote or hybrid from MN or DC

Work from home | Full-time role | Hiring

Note: This is a remote job open to candidates in the USA. UnitedHealth Group is a global organization that delivers care aided by technology to help millions of people live healthier lives. The Principal AI Data Engineer will design and build end-to-end AI pipelines for large-scale unstructured data, enabling advanced analytics and Generative AI. This role involves transforming complex datasets into AI-ready data products and supporting machine learning workflows.

Responsibilities

Design, develop, and maintain scalable data pipelines and data platforms supporting analytics, machine learning, and AI use cases
Build and optimize ingestion frameworks for large-scale structured and unstructured data, including streaming and event-driven sources
Partner with cross-functional stakeholders to understand evolving data and AI needs and define long-term technical solutions
Enable and support machine learning and AI workflows, including feature engineering, data preparation, and model deployment support
Drive strategic initiatives around Generative AI, data quality, observability, lineage, and governance
Develop and maintain frameworks that support rapid experimentation and deployment of AI/ML solutions
Introduce and evolve best practices in data modeling, orchestration, testing, and monitoring
Identify and champion opportunities for platform scalability, performance optimization, and cost efficiency
Collaborate with product, analytics, and infrastructure teams to deliver high-impact data and AI solutions
Build and maintain reusable parsing, enrichment, analytic, and service libraries to accelerate delivery across teams
Work comfortably under time-sensitive conditions while ensuring thoroughness
Maintain high ethical standards and the ability to remain objective and confidential
Build and operate production data platforms and pipelines across batch and streaming workloads
Do hands-on engineering in Python and SQL, and in JVM languages (Java/Scala) within Spark ecosystems
Apply distributed processing and lakehouse/warehouse patterns (e.g., Spark/PySpark, Databricks, Snowflake)
Build pipelines for OCR, document parsing, and text extraction from image-based or scanned data sources
Enable Generative AI solutions in production (e.g., RAG-style architectures), including retrieval patterns and evaluation/monitoring practices
Take knowledge-centric data approaches (e.g., metadata-driven systems, entity resolution, and/or graph concepts) to improve discoverability and downstream analytics
Bring a data quality, observability, and monitoring mindset (profiling, validation, alerting, and reliability improvements)
Work with orchestration, CI/CD, containerization, and infrastructure-as-code (e.g., Airflow, GitHub Actions, Docker, Terraform, Kubernetes)
Work in the cloud (AWS, Azure, and/or GCP), including secure handling of sensitive data (PII/PHI) and collaboration with compliance partners
Lead through influence, mentor engineers, and translate ambiguous problems into scalable technical roadmaps

Skills

Bachelor's degree or equivalent experience
5+ years of experience designing, building, and operating scalable data pipelines and platforms (batch + streaming)
2+ years of experience deploying Generative AI solutions to production (e.g., RAG, LLM-powered pipelines, semantic search)
Proven hands-on development experience in Python and SQL, with experience in Spark/PySpark and Databricks (or similar distributed platforms)
Experience building ingestion and processing frameworks for unstructured data (OCR, documents, images), including parsing and enrichment
Experience with cloud platforms (AWS/Azure/GCP), DevOps/CI/CD, and infrastructure-as-code, including secure handling of sensitive data (PII/PHI)
Proven ability to design scalable solutions, implement data quality/observability practices, and collaborate across stakeholders
Experience with cloud platforms such as AWS, Azure, or Google Cloud, including managed data services
Experience with streaming and event-driven architectures (e.g., Kafka, Kinesis, Event Hubs)
Experience with data quality and validation frameworks (e.g., Great Expectations, Deequ) and/or data observability tooling
Experience enabling MLOps practices (e.g., feature stores, model registries, experiment tracking, deployment automation)
Experience with lakehouse architectures, Delta Lake, and advanced Spark optimization/performance tuning
Experience with data visualization tools and libraries such as Plotly, seaborn, and Chart.js
Experience with machine learning and predictive analytics
Familiarity with security and privacy concepts for data platforms (e.g., least privilege, PII/PHI handling) and working with compliance partners
Solid hands-on engineering in Python and SQL; familiarity with JVM languages (Java/Scala) in Spark ecosystems

Benefits

A comprehensive benefits package
Incentive and recognition programs
Equity stock purchase
401k contribution (all benefits are subject to eligibility requirements)

Company Overview

UnitedHealth Group is a medical insurance company that offers health technology, patient checkups, and pharmacy services. It was founded in 1977 and is headquartered in Minneapolis, Minnesota, USA, with a workforce of more than 10,000 employees. Its website is https://www.unitedhealthgroup.com/.
