[Remote] Senior AI Quality Engineer (LLM Evaluation & Automation) 1754

Work from home Full-time role Hiring

Note: This is a remote position open to candidates in the USA. Softgic is a technology company seeking a Senior AI Quality Engineer to own the evaluation harness and the quality gate that make agent quality measurable. The role involves building and maintaining the eval harness, integrating evaluations into CI, and defining release-gate thresholds.

Responsibilities

  • Build and maintain the MVP eval harness: golden tasks, exception tasks, scorecard metrics, and regression packs
  • Wire evals into CI so quality regressions fail builds and releases
  • Define and maintain release-gate thresholds with Product and the Tech Lead
  • Lay the path for later adversarial and drift-testing expansion without overbuilding MVP scope

Skills

  • Experience evaluating ML, LLM, or non-deterministic systems
  • Strong test and benchmark design capability
  • Comfort working with noisy metrics, thresholds, and probabilistic behavior
  • Good scripting and automation skills

Company Overview

  • We drive the digital and cognitive transformation of companies through innovative, customized technology solutions that optimize processes, reduce costs, and accelerate results. The company was founded in 2011 and is headquartered in Sabaneta, Antioquia, Colombia, with a workforce of 51-200 employees. Its website is https://softwareestrategico.com.