[Remote] Lead Data Engineer (Kafka Streaming Platform)
Note: This is a remote position open to candidates in the USA. First Soft Solutions LLC is seeking an experienced Lead Data Engineer with deep expertise in Apache Kafka and enterprise event streaming architectures. The role involves designing, building, and operating scalable real-time data platforms, mentoring engineering teams, and driving operational excellence for mission-critical streaming applications.
Responsibilities
- Design and implement enterprise-scale event-driven architectures using Apache Kafka and the Confluent Platform
- Define Kafka topic architecture, partitioning strategies, replication, ordering semantics, retention policies, and replay capabilities
- Establish scalable producer and consumer design patterns to support high-throughput, low-latency workloads
- Develop enterprise reference architectures and best practices for event streaming solutions
- Deploy, configure, and manage Confluent Platform components, including:
  - Kafka Brokers
  - Schema Registry
  - Kafka Connect
  - ksqlDB
  - Control Center
  - Cluster Linking
- Design schema governance strategies using Avro, JSON Schema, and Protocol Buffers (Protobuf)
- Manage schema compatibility, evolution, and version control across enterprise applications
- Develop and optimize Kafka Connect source and sink connectors
- Implement streaming transformations, aggregations, filtering, and joins using ksqlDB
- Collaborate on advanced stream processing using Apache Flink and Confluent Cloud streaming services where applicable
- Build reliable streaming pipelines supporting real-time analytics and operational systems
- Implement Role-Based Access Control (RBAC) across Kafka, Kafka Connect, Schema Registry, and ksqlDB
- Integrate Kafka security with enterprise identity providers and authentication services
- Configure encryption, authorization, secure networking, and Bring Your Own Key (BYOK) capabilities where applicable
- Enforce least-privilege access models and enterprise governance policies
- Design resilient multi-cluster and multi-region Kafka architectures
- Implement Cluster Linking for hybrid cloud, disaster recovery, and cloud migration
- Configure offset-preserving replication strategies
- Implement Tiered Storage to optimize infrastructure costs while maintaining high-performance access to active data
- Support deployments across Microsoft Azure, Google Cloud Platform (GCP), and Confluent Cloud
- Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for streaming platforms
- Develop monitoring, alerting, logging, dashboards, and operational runbooks
- Perform capacity planning, performance tuning, and incident management
- Establish disaster recovery and business continuity procedures
- Drive continuous platform optimization and operational improvements
- Automate infrastructure provisioning using Infrastructure as Code (Terraform, ARM, or similar tools)
- Develop CI/CD pipelines for Kafka connectors, streaming applications, and configuration management
- Standardize onboarding processes for producer and consumer application teams
- Support automated deployments and configuration management across environments
- Mentor Data Engineers and Streaming Platform Engineers
- Define enterprise standards for:
  - Event contracts
  - Naming conventions
  - Topic management
  - Schema versioning
  - Testing methodologies
  - Reference architectures
- Lead architecture reviews and technology evaluations
- Collaborate with enterprise architects, platform teams, and business stakeholders
Skills
- 8+ years of experience in Data Engineering or Streaming Platform Engineering
- Extensive hands-on experience with Apache Kafka and Confluent Platform
- Strong expertise in designing enterprise event-driven architectures
- Experience implementing Schema Registry, Kafka Connect, ksqlDB, and Control Center
- Deep understanding of distributed systems, messaging architectures, and streaming data platforms
- Experience implementing Kafka security, RBAC, authentication, and encryption
- Knowledge of multi-region, disaster recovery, and hybrid cloud architectures
- Experience with Infrastructure as Code and CI/CD automation
- Strong understanding of cloud platforms including Microsoft Azure and Google Cloud Platform
- Excellent leadership, mentoring, and communication skills