[Remote] Principal Analyst, Data Integration
Note: This is a remote position open to candidates in the USA. H1 is a company dedicated to providing equitable healthcare information globally. As a Principal Analyst in Data Integration, you will manage the evaluation and onboarding of new data sources, ensuring they meet product needs and are integrated effectively into the platform.
Responsibilities
- Lead structured evaluation of new data sources from scratch — assessing schema, coverage, freshness, legal constraints, and fit against H1's product needs before any engineering work begins
- Own field mapping from source to H1's bronze/silver/gold layers, producing data dictionaries, entity definitions, and structural guidance for downstream teams
- Partner with engineering and Data Lake to define ingestion requirements, entity resolution rules, and refresh cadences for new sources
- Gather requirements from client-facing teams and translate them into integration specifications; serve as the authoritative voice on what a new source can and cannot deliver before product commitments are made
- Shepherd each source end-to-end: scoping → QA → entity matching → product launch, including product QA and communicating source capabilities and limitations to product and enablement partners
- Work with the Insights team to develop new taxonomies and QA mechanisms for novel data types
- Define acceptance criteria and lead QA validation including field-level fill rates, count comparisons, and cycle-over-cycle anomaly detection
- Investigate and resolve data quality issues post-integration, coordinating with DART and engineering as needed
- Hand off to the maintaining team with complete mapping documentation; you own onboarding, not ongoing maintenance
- Produce and maintain documentation other people actually use, including scoping assessments, field mapping specs, and post-mortems
Skills
- 8–12+ years in data-focused roles at healthcare data companies, pharma/biotech data vendors, health IT firms, or equivalent
- Demonstrated end-to-end ownership of data integrations built from scratch — scoping, field mapping, QA, and handoff — with documentation to show for it
- Healthcare or life sciences domain context required; ability to ramp on new datasets and source types each quarter without needing deep subject matter expertise upfront
- Analytical fluency to assess data quality; hands-on experience with tools such as VBA, R, or SPSS; SQL a plus but not a primary requirement
- Familiarity with data lake architectures (bronze/silver/gold or equivalent) and how raw data moves through normalization and entity resolution to a product-ready state
- Experience gathering requirements from client-facing stakeholders and translating them into data or product specifications
- Experience at a B2B data company where you understood how external clients consumed your data and where client retention drove decisions
- Exceptional written communication — your documentation is legible, maintained, and actually used
- AWS infrastructure familiarity (Athena, S3, Glue) at a query and inspection level preferred
- Comfort working in Jira or Monday in a ticket-based workflow
Benefits
- Full suite of health insurance options, in addition to generous paid time off
- Pre-planned company-wide wellness holidays
- Retirement options
- Health & charitable donation stipends
- Impactful Business Resource Groups
- Flexible work hours & the opportunity to work from anywhere