[Remote] Machine Learning Systems Evaluation Engineer | Up to $90/hour

Work from home | Full-time role | Hiring

Note: This is a remote job open to candidates in the USA. 24-MAG is offering a specialized remote consulting opportunity for experienced machine learning engineers. The role focuses on evaluating complex machine learning and AI engineering implementations, supporting workflows related to ML system evaluation, and providing structured feedback on MLOps and deployment processes.

Responsibilities

  • Use modern coding agents to complete and evaluate complex machine learning and AI engineering tasks
  • Review generated implementations involving model training, inference systems, MLOps workflows, LLM applications, and AI-powered product features
  • Assess technical outputs for correctness, quality, maintainability, performance, reliability, and production-readiness
  • Apply professional machine learning engineering judgment to realistic technical scenarios
  • Evaluate ML system workflows involving model deployment, inference infrastructure, monitoring, testing, and production integration
  • Review implementation choices related to scalability, latency, data flow, model serving, reliability, and system maintainability
  • Identify bugs, edge cases, performance issues, failure modes, and weak assumptions in ML engineering outputs
  • Provide structured feedback on MLOps design, deployment patterns, and production ML system quality
  • Compare outputs from multiple coding agents and assess their strengths, weaknesses, accuracy, and practical usefulness
  • Identify where generated solutions succeed, where they fail, and where additional ML engineering judgment is required
  • Evaluate whether generated machine learning implementations reflect real-world engineering standards
  • Document technical review findings clearly for project teams and quality evaluation workflows
  • Produce clear, structured evaluations of machine learning engineering tasks and generated outputs
  • Explain reasoning around model training, inference systems, deployment infrastructure, LLM applications, performance, and architectural trade-offs
  • Support technical assessment workflows by documenting accepted work, improvement areas, and practical engineering conclusions
  • Help ensure outputs reflect production-scale machine learning engineering expectations

Skills

  • 2+ years of professional machine learning engineering experience
  • Hands-on experience building production ML systems, model deployment infrastructure, LLM applications, or AI-powered products
  • Regular use of AI coding agents such as Cursor, Claude Code, Codex, Windsurf, Gemini CLI, or comparable tools
  • Ability to evaluate generated machine learning implementations and identify technical trade-offs, bugs, edge cases, and performance issues
  • Strong understanding of model training, inference workflows, MLOps, data pipelines, evaluation methods, deployment patterns, and system reliability
  • Clear written communication skills and comfort documenting technical reasoning in a remote, project-based environment
  • A degree in Computer Science, Machine Learning, Artificial Intelligence, Data Science, Software Engineering, Computer Engineering, Statistics, Mathematics, or a related technical field is helpful
  • Equivalent professional experience in machine learning engineering, applied AI, MLOps, LLM applications, or production ML systems is also highly relevant
  • Experience deploying ML systems to production is strongly preferred
  • Experience with Python, PyTorch, TensorFlow, scikit-learn, Hugging Face, LangChain, LlamaIndex, MLflow, Ray, or comparable ML tools
  • Familiarity with model serving, feature pipelines, vector databases, embeddings, retrieval systems, LLM application architecture, or evaluation frameworks
  • Experience with cloud platforms, Docker, Kubernetes, CI/CD pipelines, observability tooling, or production deployment workflows
  • Background in technical code review, ML architecture review, model performance evaluation, or large-scale AI product engineering
  • Comfortable working in sprint-based project environments with focused technical assessment windows

Benefits

  • Remote consulting work aligned with machine learning engineering, coding agent, and technical evaluation expertise
  • Fully remote and flexible scheduling
  • Sprint-based, project-based availability
  • Payments are made weekly via Stripe or Wise based on services rendered
  • Some projects may use accepted-task compensation depending on the specific workflow

Company Overview

  • At 24-MAG, we support emerging AI and consulting platforms by sourcing and connecting qualified professionals with remote, contract-based opportunities. The company is headquartered in Sheridan, Wyoming, US, and has 2-10 employees. Its website is https://24-mag.com/.