[Remote] Data Engineer (Return-to-Work Program)
Note: This is a remote position open to candidates in the USA. Precision Technologies is offering a return-to-work program for women seeking to re-enter the workforce. The Data Engineer role involves designing and maintaining data pipelines, working with big data technologies, and building cloud-native data platforms to support enterprise data integration solutions.
Skills
- At least 4 years of experience designing, developing, and maintaining scalable data pipelines, ETL/ELT workflows, and enterprise data integration solutions
- Expertise in Python, SQL, PySpark, Spark SQL, Scala, and distributed data processing frameworks for handling large-scale datasets
- Experience with big data technologies including Apache Spark, Databricks, Hadoop, Airflow, Kafka, and Snowflake for modern data engineering workloads
- Experience building cloud-native data platforms using AWS, Azure, or GCP, with a strong understanding of scalable and highly available data architectures
- Working knowledge of cloud services such as AWS (S3, Glue, Redshift, Athena, EMR, Lambda, Kinesis) or Azure (Data Factory, Synapse Analytics, ADLS, Databricks, Event Hubs)
- Experience designing and implementing data warehouses, dimensional models, star schemas, snowflake schemas, data lakes, and lakehouse architectures
- Experience building and optimizing batch processing and real-time streaming data pipelines using technologies such as Kafka, Spark Streaming, Flink, or Kinesis
- Experience handling structured, semi-structured, and unstructured data using file formats including Parquet, Avro, ORC, CSV, and JSON
- Experience working with relational and NoSQL databases such as PostgreSQL, MySQL, Oracle, MongoDB, Cassandra, and DynamoDB, including query optimization and performance tuning
- Familiarity with version control (Git), CI/CD pipelines, DevOps practices, and infrastructure automation, using tools such as Jenkins, GitHub Actions, Azure DevOps, or GitLab CI/CD
- Understanding of data quality, data governance, data security, monitoring, observability, partitioning strategies, and troubleshooting distributed systems