[Remote] Senior Data Engineer
Note: This is a remote position open to candidates in the USA. Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. They are seeking a skilled Senior Data Engineer to architect, design, develop, and maintain enterprise-grade data platforms and scalable data pipelines that support analytics and business intelligence initiatives.
Responsibilities
- Design, build, and continuously refine scalable batch and real-time data pipelines using Python, SQL, Spark, Scala, or equivalent technologies, ensuring reliable, efficient, and high-performance data movement across enterprise systems while supporting evolving business and analytical requirements
- Author secure, reusable, and production-quality ETL/ELT workflows that adhere to enterprise coding standards, data governance policies, data quality principles, and security best practices, incorporating validation, encryption, auditing, and error handling throughout the data lifecycle
- Develop scalable data integration solutions using modern cloud data platforms such as AWS, Azure, or Google Cloud, leveraging services including Databricks, Snowflake, BigQuery, Redshift, Synapse Analytics, Data Factory, Glue, or equivalent technologies to enable enterprise data processing
- Design and implement robust data architectures, dimensional data models, data lakes, data warehouses, and streaming data solutions that integrate multiple structured, semi-structured, and unstructured data sources while ensuring consistency, scalability, and high availability
- Actively participate in enterprise data architecture discussions, cloud migration initiatives, technical design reviews, and solution planning sessions by evaluating trade-offs involving scalability, performance, maintainability, governance, security, and operational costs
- Continuously monitor, profile, and optimize ETL processes, Spark jobs, SQL queries, database performance, storage utilization, partitioning strategies, and pipeline throughput by identifying bottlenecks and implementing measurable performance improvements
- Implement and maintain robust metadata management, data cataloging, lineage tracking, schema evolution, data quality validation, monitoring, and governance frameworks that ensure trusted, discoverable, and compliant enterprise data assets
- Develop comprehensive automated testing frameworks covering data pipelines, ETL workflows, data validation, reconciliation, integration testing, and performance testing, using modern testing methodologies and data quality tools to ensure reliable production deployments
- Contribute meaningfully to CI/CD pipeline design, infrastructure automation, and deployment processes using Jenkins, GitHub Actions, Azure DevOps, Terraform, Docker, Kubernetes, or equivalent technologies, enabling consistent and automated delivery of enterprise data solutions
- Proactively identify data pipeline bottlenecks, operational risks, technical debt, scalability challenges, and architectural weaknesses while driving continuous improvement initiatives through optimization, refactoring, technical documentation, and engineering best practices
- Collaborate effectively within Agile/Scrum delivery teams by participating in sprint planning, backlog refinement, daily standups, architecture discussions, sprint reviews, and retrospectives to ensure consistent delivery of scalable, high-quality data engineering solutions
- Maintain clear, current, and comprehensive technical documentation—including data architecture diagrams, pipeline specifications, ETL workflows, metadata documentation, deployment guides, operational runbooks, and disaster recovery procedures—to ensure maintainability, governance, and knowledge sharing across teams
Skills
- Bachelor's degree in Computer Science, Information Technology, Data Engineering, Software Engineering, Mathematics, or a closely related technical discipline
- Five or more years of professional experience designing, developing, and supporting production-grade enterprise data engineering solutions, ETL pipelines, and cloud-based data platforms
- Strong, demonstrable understanding of data structures, database design, distributed computing, data modeling, ETL/ELT methodologies, data warehousing concepts, and large-scale data architecture principles
- Advanced working knowledge of Python, SQL, Spark, Scala, Java, and enterprise data engineering frameworks used to build scalable, high-performance data processing solutions
- Hands-on, production-level experience designing and operating batch processing, streaming data pipelines, data lakes, and cloud-native data platforms using technologies such as Databricks, Snowflake, Apache Spark, Kafka, Airflow, or equivalent solutions
- Proven experience working with relational and NoSQL databases including PostgreSQL, SQL Server, Oracle, MySQL, MongoDB, Cassandra, or equivalent database technologies, including schema design, query optimization, indexing strategies, and performance tuning
- Strong SQL skills and meaningful experience designing dimensional models, star schemas, snowflake schemas, data marts, partitioning strategies, indexing, and enterprise-scale data warehouse solutions
- Solid experience with Git-based version control, CI/CD pipelines, DevOps practices, release management, infrastructure automation, and Agile software development methodologies supporting enterprise data engineering initiatives
- Hands-on experience deploying enterprise data platforms and analytics solutions on AWS, Azure, or Google Cloud Platform, including managed storage, compute, networking, security, identity management, and data integration services
- Strong troubleshooting, analytical thinking, debugging, root-cause analysis, communication, and documentation skills, with the ability to investigate complex data processing issues methodically and implement scalable, maintainable engineering solutions
- Experience designing and implementing event-driven architectures and real-time data streaming platforms using Apache Kafka, Apache Flink, Apache NiFi, RabbitMQ, or equivalent enterprise messaging and streaming technologies
- Familiarity with containerization, orchestration, Infrastructure as Code, and cloud-native deployment practices using Docker, Kubernetes, Terraform, Helm, or equivalent enterprise automation technologies
- Exposure to distributed systems concepts including eventual consistency, fault tolerance, distributed transactions, data replication, partitioning strategies, CAP theorem, high availability, and large-scale data processing architectures
- Experience implementing data governance frameworks, master data management (MDM), data lineage, metadata management, data quality automation, security compliance, and DataOps best practices within enterprise cloud and Agile development environments
Benefits
- Competitive base salary commensurate with experience, plus benefits.
- No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.
- Full-time, direct W2 with Bright Vision Technologies (no C2C, no 1099, no third-party).
- Long-term, multi-year engagement aligned to the Bright Vision SOW delivery roadmap.
Company Overview