[Remote] Data Engineer (AEP)
Note: This is a remote position open to candidates in the USA. NAVA Software Solutions is seeking a hands-on AXP Data Engineer to design, build, and operate scalable data pipelines. The role works primarily within Databricks to manage data ingestion and transformation, ensuring that high-quality data is available for decision-making and analytics.
Responsibilities
- Design and build batch and streaming data pipelines in Databricks (Delta Live Tables, Structured Streaming, and Databricks Workflows) to ingest, transform, and serve data from AEP, AJO, Kafka, and Business Platform sources
- Develop and maintain Delta Lake tables, applying medallion architecture (Bronze → Silver → Gold) patterns for raw ingestion, data cleansing, enrichment, and aggregation layers
- Implement Kafka consumers within Databricks to process real-time AXP exposure events (Sent, Delivered, Clicked, Opened, Disposition) and Business recommendation signals
- Integrate with AEP datasets and Data Distiller to extract, query, and transform profile attributes, segment membership, and behavioral event data for downstream consumption
- Build and maintain data models that support CJA reporting, Business State Machine inputs, and ML feature engineering use cases
- Enforce data quality checks, schema validation, and reconciliation logic across pipeline stages to ensure accuracy, completeness, and consistency
- Optimize Databricks pipeline performance: cluster configuration, auto-scaling, partitioning, caching, Z-ordering, and query plan tuning for large-scale event datasets
- Maintain schema registry alignment and XDM-compatible data structures across all pipeline outputs
- Support data governance standards: lineage documentation, metadata cataloging (Unity Catalog), access controls, and PII handling policies
- Monitor pipeline health via structured logging, alerting, and SLA dashboards; triage and resolve data incidents in production
- Collaborate with the Data Architect, CJA Architect, AXP Architect, and BI/Analytics teams to align pipeline designs with reporting and analytics requirements
Skills
- 10+ years of data engineering experience, with significant hands-on Databricks delivery in production environments
- Proven experience building and operating both batch and streaming pipelines at scale using Delta Lake and Spark
- Experience integrating with Kafka or other real-time event-streaming platforms as a consumer
- Databricks – Delta Live Tables, Structured Streaming, Databricks Workflows, cluster management, Unity Catalog
- Delta Lake / Lakehouse architecture – medallion design patterns, ACID transactions, time travel, schema evolution
- PySpark and/or Scala Spark – large-scale data transformation, aggregations, windowing, and joins
- SQL – complex query authoring, performance tuning, and Data Distiller query patterns for AEP datasets
- Kafka – real-time event consumption, offset management, schema-on-read patterns
- AEP / Data Distiller – dataset querying, XDM schema familiarity, profile and event dataset consumption
- Cloud data platform experience (AWS S3/Glue, Azure ADLS/Synapse, or GCP GCS/BigQuery)
- Data quality and observability tooling (Great Expectations, Monte Carlo, or equivalent)
- CI/CD for data pipelines: version control (Git), automated testing, and deployment automation
- Understanding of data governance principles: lineage, cataloging, access control, and PII/data privacy
- Familiarity with AEP, Data Distiller, or Adobe Analytics data models preferred
- Exposure to MarTech, CDP, or personalization platform data flows (AJO, AEP, or equivalent) is a strong plus