[Remote] Mid-Level Data Engineer, Veterans Affairs

Work from home, full-time role

Note: This is a remote position open to candidates in the USA. ThunderYard Solutions is seeking a Data Engineer to support the U.S. Department of Veterans Affairs in designing, developing, and maintaining scalable data solutions. The role involves collaborating with cross-functional teams to optimize data pipelines and ensure compliance with federal standards.

Responsibilities

Design, develop, and maintain ETL/ELT pipelines to ingest, transform, and load data from multiple sources such as APIs, relational databases, cloud storage, and streaming platforms
Build scalable batch and near-real-time data pipelines using Databricks and Apache Spark (PySpark / SQL)
Implement data transformation logic following best practices for performance, reliability, and reusability
Support schema evolution, data validation, deduplication, and error handling in ETL workflows
Develop and optimize pipelines using Delta Lake and medallion (Bronze / Silver / Gold) architecture patterns
Use Databricks Workflows / Jobs or similar orchestration tools to schedule and monitor pipelines
Optimize Spark jobs for performance and cost (partitioning, caching, file sizing, query tuning)
Collaborate on data governance initiatives using Unity Catalog, access controls, and lineage where applicable
Work closely with data architects, analytics teams, and downstream consumers to define data requirements
Troubleshoot pipeline failures and data quality issues, and implement long-term fixes
Produce documentation for pipelines, datasets, and operational runbooks
Participate in CI/CD practices using Git-based version control for notebooks and code deployments

Skills

3+ years of experience as a Data Engineer or in a similar data-focused role
Hands-on experience with Databricks
Strong experience building ETL/ELT pipelines
Proficiency in Python and SQL
Experience with Apache Spark / PySpark
Familiarity with cloud platforms such as Azure
Solid understanding of data modeling, data warehousing, and analytics use cases
Experience with Delta Live Tables (DLT) or Databricks Auto Loader
Experience with orchestration tools such as Airflow
Familiarity with streaming data technologies (Kafka, Event Hubs, Kinesis)
Experience supporting analytics tools (Power BI, Tableau, Looker) connected to Databricks
Databricks certification (Associate or Professional)

Benefits

Medical, dental, and vision insurance
401(k) matching
PTO
Certification reimbursement

Company Overview

ThunderYard Solutions describes itself as a force of industry-leading talent dedicated to advancing cloud-native solutions that transform outcomes for customers everywhere. The company was founded in 2019, is headquartered in Baltimore, Maryland, USA, and has a workforce of 51-200 employees. Its website is https://thunderyard.com.
