[Remote] Data Engineer (Databricks)
Note: This is a remote position open to candidates in the USA. Livefront helps businesses design and build digital products. The company is seeking a Databricks Data Engineer to build the data and AI foundations for digital products and intelligent experiences.
Responsibilities
- Design and build production data pipelines using Lakeflow Declarative Pipelines, Auto Loader, and Structured Streaming, with end-to-end ownership of ingestion, transformation, data quality expectations, and CI/CD deployment via Declarative Automation Bundles
- Architect and implement Lakehouse solutions on Databricks — medallion architecture, Delta Lake, Unity Catalog — tailored to the client's analytics, AI, and application needs
- Build and maintain Databricks transformation layers — DLT pipelines, PySpark notebooks, and dbt — with data quality constraints and SLAs baked in
- Design and maintain the data and AI foundations — Unity Catalog, Feature Store, MLflow, and Model Serving — that power production ML, agent workflows, and AI-enabled digital products
- Collaborate with product and backend engineers to design data models, APIs, and application data contracts — ensuring the platform serves the product, not just the warehouse
- Consult with clients to understand their data challenges, develop data strategies, and implement sustainable solutions
- Adapt your approach based on project needs — sometimes leading data architecture discussions with clients, other times supporting internal teams with specialized data expertise
- Work within multi-cloud environments — primarily AWS and Azure — anchoring data platform recommendations around Databricks where it fits the client's architecture and goals
- Champion data governance through Unity Catalog — access control, lineage, data quality policies, and compliance — as a first-class part of every engagement, not an afterthought
- Design data-to-application architectures — including Lakebase-backed services and Databricks Apps — that connect governed data to AI workflows, digital products, and user-facing experiences
- Help build Livefront's Databricks practice — contributing to accelerators, internal enablement, certification goals, and Databricks partner go-to-market materials alongside delivery work
Skills
- 3-5 years of data engineering experience with at least 2 years in production Databricks environments, preferably in a consulting or client delivery context
- Solid working knowledge of AWS and Azure cloud services relevant to Databricks deployments — storage, networking, IAM, and compute — with GCP familiarity a plus
- Deep, production-grade Databricks expertise: Lakeflow Declarative Pipelines, Auto Loader, Structured Streaming, Lakeflow Jobs, and Unity Catalog (including fine-grained access control and lineage), demonstrated through shipped production workloads, not prototypes
- Proven experience designing Lakehouse architectures — medallion patterns, Delta Lake table design, partitioning, Z-ordering, and query optimization — at production scale
- Hands-on experience with data pipeline testing, observability, and CI/CD for data — including unit testing, data quality frameworks, and version-controlled deployments via Git and Declarative Automation Bundles
- Strong proficiency in SQL and Python, with the ability to write clean, performant, and maintainable code
- Understanding of data modeling, schema design, and query optimization
- Excellent communication skills with the ability to explain complex data concepts to both technical and non-technical stakeholders
- Strong problem-solving skills with the ability to navigate ambiguous requirements and deliver pragmatic solutions
- Above-average discipline and personal organization skills
- Obvious comfort with critique and peer review in the context of an iterative development process
- A demonstrated hunger for personal and professional growth
- A self-evident love and care for the craft of data engineering
It is a plus if you:
- Have worked with real-time streaming technologies (Kafka, Kinesis, etc.)
- Have hands-on experience with alternative cloud data platforms — useful context for migrations and competitive assessments, though Databricks is our primary platform focus
- Have experience in healthcare or fintech domains
- Have hands-on experience with MLOps or LLMOps on Databricks — MLflow experiment tracking, model registry, Model Serving endpoints, or Vector Search for RAG pipelines
- Have experience with Java, Go, or Scala
- Have strong illustration skills for technical diagramming and data architecture documentation
- Speak, write, and/or educate publicly about data engineering topics
- Have contributed to open-source data projects
- Hold or are actively pursuing a Databricks certification (Data Engineer Associate or Professional, or Apache Spark Developer) — we treat these as meaningful signals of platform depth, and they directly support our Databricks partner growth goals
- Have experience with Databricks Apps or Lakebase; early familiarity with where the Databricks platform is heading is a strong differentiator