[Remote] AI/LLM Safety Engineer
Note: This is a remote position open to candidates in the USA. Propio Language Services is on a mission to make communication accessible to everyone through real-time interpretation and multilingual language services. The company is seeking an AI/LLM Safety Engineer to ensure the safety and reliability of AI models and agents in production, with an emphasis on AI safety, trust, and responsible AI.
Responsibilities
- Design and maintain a safety evaluation framework—adversarial prompt sets, scenario-based test suites, and regression suites—so that every model and agent update is validated before it ships
- Lead structured red-teaming exercises covering jailbreaks, prompt injection, tool misuse, and data exfiltration; document findings and drive each issue through to remediation and closure
- Build and iterate on guardrail logic, including input/output filtering, tool-boundary constraints, action validation, sensitive-data redaction, and policy prompting
- Integrate safety checks into CI/CD and runtime so that unsafe behavior is intercepted before it reaches users
- Perform threat modeling for agentic scenarios: tool-call boundaries, sandbox isolation, and least-privilege access, with particular attention to preventing agents from exfiltrating data or executing irreversible actions through chained tool calls
- Conduct safety reviews of reinforcement-learning (RL) environments and trajectory data, partnering with environment and agent engineering teams to embed safety constraints directly into the environments themselves
- Instrument AI features for safety with structured logging, tracing, and metrics, enabling detection of unsafe patterns and regressions in production
- Prepare evidence for governance reviews—test reports, evaluation summaries, and mitigation validation—aligned with internal Responsible AI standards
- Collaborate with Product and UX to improve safety interactions (warnings, confirmations, refusal messaging, and feedback collection), and align evaluation goals with the Research and Data teams
Skills
- Bachelor's or Master's degree in Computer Science, Software Engineering, Cybersecurity, or a related technical field—or equivalent practical experience
- 4+ years building production software, with direct experience working on—or securing—ML/LLM systems
- Strong software engineering skills with the ability to write production-grade code (primarily Python), beyond scripting or notebook prototyping
- Solid understanding of LLMs and ML: how models work, prompt engineering, and the safety implications of fine-tuning and RAG (e.g., unsafe retrieval, tool misuse, and data exfiltration)
- A security mindset with a demonstrated ability to threat-model AI workflows, and familiarity with the fundamentals of access control, data retention, and incident response
- Familiarity with the LLM attack surface—prompt injection, jailbreaks, data poisoning, and supply-chain risk—and working knowledge of the OWASP LLM Top 10
- Hands-on experience with safety evaluation, red teaming, or both, with the ability to walk through a real finding and how it was remediated
- Hands-on experience with industry safety tooling such as garak, PyRIT, promptfoo, Giskard, and NeMo Guardrails, and the ability to articulate the trade-offs between them
- Visible output in AI safety or security: publications at relevant venues (e.g., the NeurIPS AI Safety Workshop, USENIX Security, or DEF CON AI Village), open-source contributions, or responsible disclosures on frontier models with public write-ups
- Familiarity with AI governance and compliance frameworks (NIST AI RMF, ISO/IEC 42001, EU AI Act) and the ability to translate compliance requirements into concrete engineering tasks
- Engineering experience with agents, RL environments, and/or tool use
- Practical experience with threat-modeling methodologies such as MITRE ATLAS and STRIDE/PASTA
Company Overview