[Remote] Sr Data Science Engineer
Note: This is a remote position open to candidates in the USA. LegitScript is dedicated to making the internet and payment ecosystems safer and more transparent. The Sr Data Science Engineer will own the full lifecycle of model development, from data ingestion to deployment, with a focus on building sophisticated risk detection systems using advanced machine learning techniques.
Responsibilities
- Research, prototype, and develop ML and LLM-based models to solve complex business problems, with a current focus on risk detection and prioritization
- Wrap models into production-ready APIs and integrate them into our core product
- Ensure model outputs are interpretable — translating predictions into actionable reason codes for end users
- Partner directly with operational teams to gather feedback, refine features, and improve model relevance over time
- Design, build, and maintain scalable pipelines to ingest data from disparate sources into our data warehouse/lake
- Implement robust data validation, quality checks, and transformation workflows across raw, curated, and serving layers
- Build and maintain curated datasets optimized for both analytics and model training use cases
- Implement and maintain CI/CD pipelines for both data workflows and ML model deployment across environments
- Monitor pipeline latency, data drift, and model performance in production; design alerting and retraining triggers
- Own the business outcomes of your models — define success metrics, track ROI, and iterate based on real-world efficacy
- Manage infrastructure as code and containerized deployments to ensure reproducible, environment-consistent releases
Skills
- 5–8+ years spanning data engineering and data science/ML, with a demonstrated track record of shipping models to production
- Strong Python proficiency; experience with Spark/PySpark for large-scale data processing
- Advanced SQL for complex transformation, analysis, and data modeling
- Hands-on experience with cloud data platforms such as Databricks or Snowflake
- Experience with ETL/ELT frameworks — dbt, Lakeflow Declarative Pipelines, Databricks Autoloader, Informatica, or similar
- Familiarity with ML experiment tracking tools such as MLflow or Weights & Biases
- DevOps fluency: Git-based development, branching strategies, CI/CD, IaC (DABs/Terraform), and Docker
- Experience with orchestration tools such as Databricks Workflows or Apache Airflow
- Hands-on experience with LLMs and Generative AI techniques in a production context (prompt engineering, RAG architectures, fine-tuning, or evaluation frameworks)
- Experience building or operating ML platforms, feature stores, or model registries
- Prior work in risk, compliance, fraud detection, or other high-stakes ML domains
Benefits
- Competitive compensation
- Flexible work options
- A team that's genuinely invested in your success
Company Overview