[Remote] Data Engineer, MDM
Note: This is a remote position open to candidates in the USA. Semarchy is a data management company founded in 2011 and headquartered in Arizona, USA. The company is seeking a Data Engineer to design and build the data pipelines that power client implementations, collaborating closely with customer teams and ensuring optimal data integration.
Responsibilities
- Full-Lifecycle Pipeline Delivery: Own the pipeline workstream through the full implementation lifecycle — discovery, design, build, test, cutover, and stabilization
- Pipeline Design & Build: Design and build the data pipelines that move information into and out of the Semarchy platform, using Semarchy xDI and adjacent data integration tooling as needed. Configure batch, near-real-time, and streaming integration patterns to fit each customer's data landscape and cadence requirements
- SQL, Data Modeling & Data Quality: Own the SQL, data modeling, and data quality work that supports pipeline reliability, reconciliation, and downstream consumption
- Customer Collaboration & Troubleshooting: Collaborate with customers to understand their data requirements, translate them into practical solutions, and communicate progress, risk, and dependencies clearly. Troubleshoot and resolve integration issues, coordinating with Support, Product, and Engineering as needed
- Scoping Support: Provide input into effort estimation and technical scoping on prospective engagements
- Practice Building: Contribute to Semarchy's implementation methodology, templates, and reference architectures for pipeline design. Identify opportunities to productize common data engineering patterns and reduce time-to-value for future customers. Stay current with data integration trends and tools, and bring relevant ones into Semarchy's delivery practice
Skills
- 5+ years in a hands-on data engineering, ETL development, or data delivery role in enterprise environments
- Strong hands-on experience with an enterprise ETL or data integration platform (e.g., Semarchy xDI, Talend, Informatica, Matillion, dbt) — process design, data mapping, and reusable integration patterns
- Production experience with a streaming or event-driven platform (e.g., Kafka)
- Production experience with a workflow orchestration tool (e.g., Apache Airflow) for scheduling, dependencies, and monitoring
- Comfortable deploying and operating workloads in containerized environments (e.g., Kubernetes)
- Advanced SQL — production-grade queries for extraction, transformation, reconciliation, and troubleshooting, across major RDBMS or cloud data warehouses (e.g., PostgreSQL, Snowflake, Databricks)
- Working knowledge of data architecture, data modeling, and data warehousing patterns
- Comfortable in customer-facing delivery — able to run a discovery workshop, present a technical recommendation, explain trade-offs to non-experts, and write clean documentation
- Familiarity with master data management concepts (matching, survivorship, hierarchies), or a demonstrated ability to build that knowledge quickly
- Direct Semarchy experience (xDM, SDP, xDI, Snowflake Native App)
- Prior experience at a Big 4, systems integrator, or specialized data consultancy
- Certifications in cloud data platforms (Snowflake, AWS, Azure, GCP, Databricks) or streaming/containerization technologies
- Experience in a regulated or data-intensive industry (financial services, healthcare and life sciences, retail, manufacturing)
- Bachelor's or Master's degree in Computer Science, Information Systems, or a related technical discipline
Company Overview