[Remote] Senior Machine Learning Ops Engineer
Note: This is a remote position open to candidates in the USA. Sheetz, Inc. operates stores and offers various services, and is seeking a Senior Machine Learning Ops Engineer. This role is responsible for ensuring that AI models transition smoothly from development to production, maintaining the operational integrity of machine learning systems, and enhancing business capabilities through scalable ML solutions.
Responsibilities
- Lead the end-to-end development and optimization of ML pipelines, including training, validation, deployment, monitoring, and retraining workflows at scale
- Implement infrastructure for, and guide the use of, tools such as MLflow, TensorFlow, PyTorch, Docker, and Kubernetes to support scalable production workflows for model deployment and lifecycle management
- Design and maintain tooling for performance monitoring, drift detection, and automated alerting
- Develop CI/CD pipelines to enable safe, rapid model iteration, deployment, and retraining across environments
- Write, review, and maintain high-quality, production-ready code, ensuring robust, reproducible, and secure ML systems
- Apply advanced software engineering and MLOps best practices to operationalize machine learning solutions efficiently and reliably
- Collaborate with cross-functional teams to align ML solutions with business needs and system requirements and guide integration efforts to embed ML into production applications
- Maintain thorough documentation, version control, metadata tracking, and lineage to support reproducibility and compliance of ML models
- Recommend and implement improvements to ML infrastructure, frameworks, and operational standards, elevating the organization’s ML maturity and capabilities
- Mentor and coach junior engineers, providing guidance on technical challenges, workflow design, and career development
Skills
- Bachelor's degree in Computer Science, Management Information Systems, Computer Engineering, or related discipline
- Minimum 5 years of hands-on experience in designing, developing, and operationalizing machine learning solutions, with a strong focus on MLOps practices and infrastructure
- Previous experience working with large databases – both structured and unstructured – to build data pipelines and self-service dashboards for business users
- Previous experience in managing machine learning pipelines, lifecycle management, and deployment at scale, including training, validation, serving, and monitoring
- Previous experience with CI/CD pipelines for ML workflows and containerization tools such as Docker and Kubernetes
- Previous experience with secure and scalable cloud environments (e.g., AWS, GCP, Azure) and infrastructure-as-code and platform-as-a-service (PaaS) offerings
- Cloud Platforms (AWS, GCP, Azure)
- MLOps tools and frameworks (e.g., MLflow, Kubeflow, TFX)
- DevOps certifications (e.g., Docker, Kubernetes, Terraform, CI/CD tools)
Benefits
- Quarterly employee bonuses based on company performance
- Competitive salaries
- PTO and parental leave
- 401k match and employee stock ownership
- Limitless professional development and growth opportunities
- Tuition reimbursement
- Full medical, vision and dental coverage
- Snack discounts
- Remote work arrangement within our 7-state footprint (PA, OH, MI, WV, VA, MD, NC)
Company Overview