Data Analysis and Statistical Modeling Scientist
About OpenTrain
OpenTrain is the leading platform for building careers in AI training and data labeling. Contributors use OpenTrain to discover projects, build a unified portfolio, and grow a flexible freelance career teaching AI systems with real human expertise.
About AI training work
AI training (also called data labeling or human feedback work) is the human side of building intelligent systems: people prepare, review, and shape the examples models learn from. This role gives you the chance to influence how models reason and perform while working remotely and flexibly.
The role: high level
OpenTrain is recruiting a Data Analysis and Statistical Modeling Scientist to support the training of next-generation AI systems. The position blends hands-on data work with statistical modeling, visualization, and collaborative project delivery on a remote contract basis.
- Commitment: 20+ hours/week (part-time, contractor).
- Experience level: Entry level (suitable for early-career data scientists).
- Pay: $30–$100 per hour (USD).
- Location: Remote, worldwide; English required.
What you'll do
- Collect, clean, and preprocess diverse datasets to ensure integrity and readiness for analysis.
- Develop, validate, and implement statistical models to extract actionable insights from complex data.
- Perform exploratory data analysis to identify trends, patterns, and opportunities.
- Create compelling visualizations and reports to present findings and recommendations.
- Collaborate with cross-functional team members to design and execute end-to-end data projects.
- Improve analytical methodologies and automate routine processes where possible.
- Communicate analyses clearly in both written and verbal formats to stakeholders.
Minimum requirements
You must have the core technical skills and experience listed below. These are required for the role and will form the basis of any evaluation or onboarding tasks.
- Expertise in statistics, mathematics, and data analysis techniques.
- Proven experience collecting, handling, and cleaning large, complex datasets.
- Proficiency in Python, R, or similar languages for data manipulation and modeling.
- Strong data modeling skills, including building and validating predictive models.
- Advanced data visualization ability using Tableau, Power BI, or visualization libraries.
- Meticulous attention to data quality and detail at every stage.
- Excellent written and verbal communication skills with emphasis on clarity.
Helpful background (preferred)
- Experience working remotely in cross-functional or customer-focused teams.
- Background developing and deploying machine learning solutions to production.
Project details & tools
This engagement focuses on text data and combines data collection and evaluation tasks to support modeling workflows. Labeling and tooling may use proprietary or third-party systems described as 'OTHER' in project metadata.
- Primary data type: TEXT.
- Labeling tasks: DATA_COLLECTION and EVALUATION_RATING.
- Tooling: OTHER (proprietary or non-listed tools may be used).
- Employment types: Contractor, Part-time.
How to apply and next steps
To apply, create an OpenTrain account (free) and submit your profile and resume. Include examples of relevant data projects, code samples, or visualizations if available. Qualified candidates will be contacted with next steps and any brief evaluation tasks required for onboarding.
- Prepare a short portfolio or links to notebooks, dashboards, or model code when possible.
- Applications are reviewed based on skills, portfolio, and alignment with project needs.