[Remote] Principal DevOps Engineer

Work from home · Full-time role · Hiring

Note: This is a remote position open to candidates in the USA. Zeta Global is an AI-Powered Marketing Cloud that uses artificial intelligence to improve marketing efficiency. The company is seeking a Principal DevOps Engineer to transform its software deployment processes, enabling continuous integration and deployment while ensuring platform reliability and regulatory compliance.

Responsibilities

  • Design, build, and operate production-grade CI/CD pipelines enabling multiple developers on multiple teams to deploy concurrently to production, multiple times daily, with zero-downtime guarantees
  • Implement and optimize advanced deployment strategies including canary releases, blue/green deployments, rolling updates, incremental rollouts, and feature flag-gated releases via Statsig
  • Build self-service deployment tooling that empowers developers to own their release process while enforcing safety guardrails, automated rollback triggers, and automated compliance gates
  • Establish deployment observability with real-time canary analysis, automated health scoring, and progressive delivery metrics integrated with Grafana, Prometheus, and Honeycomb
  • Champion CI/CD workflows using GitLab CI/CD, Helm charts, and Terraform to ensure infrastructure and application deployments are version-controlled, auditable, and reproducible
  • Define and enforce SLOs/SLIs/SLAs across services, establishing error budgets that balance velocity with reliability
  • Lead incident response processes, including on-call rotations, runbook development, blameless postmortems, and incident command structure
  • Design and implement robust observability stacks leveraging Grafana, Prometheus, Loki, and Honeycomb for metrics, logging, tracing, and alerting at scale
  • Proactively identify and eliminate reliability risks through chaos engineering, load testing, capacity planning, and failure mode analysis
  • Reduce operational toil through automation, self-healing infrastructure patterns, and intelligent alerting to minimize mean time to detection (MTTD) and recovery (MTTR)
  • Manage and optimize AWS infrastructure spanning EC2, SQS, DynamoDB, and related services with Infrastructure as Code (Terraform) best practices
  • Design and operate Kafka-based event streaming infrastructure for high-throughput, low-latency data pipelines supporting real-time marketing and analytics workloads
  • Ensure robust networking across the platform, including DNS management, service mesh configuration, load balancing, TCP/IP optimization, routing policies, and VPC architecture
  • Manage containerization strategy using Docker, ensuring efficient image builds, vulnerability scanning, registry management, and runtime security
  • Support data infrastructure operations across Snowflake, MySQL, and other database platforms, collaborating with data engineering teams on reliability and performance
  • Embed compliance controls directly into CI/CD pipelines, ensuring automated enforcement of GDPR, CCPA, and SOC 2 requirements at every stage of the software delivery lifecycle
  • Implement audit trails, change management controls, and deployment approval workflows required by regulatory frameworks in the MarTech and AdTech domains
  • Collaborate with Security and Legal teams to ensure infrastructure and deployment processes meet global compliance obligations across all operating regions
  • Maintain awareness of evolving privacy regulations (ePrivacy, state-level US laws, international data residency requirements) and proactively adapt infrastructure accordingly
  • Serve as a technical leader and DevOps disruptor, challenging legacy processes and introducing modern practices that dramatically improve developer velocity and operational safety
  • Influence software architecture decisions to simplify and streamline operational management, advocating for patterns that are deployment-friendly, observable, and resilient by design
  • Clearly communicate complex technical strategies to engineering leadership, product stakeholders, and cross-functional teams to build alignment and drive adoption
  • Develop reference architectures, internal standards, and golden path templates that codify best practices and accelerate onboarding of new services and teams
  • Participate in on-call rotations and lead by example in incident response, demonstrating the operational discipline expected across the engineering organization

Skills

  • 10+ years of progressive experience in DevOps, SRE, Platform Engineering, or Infrastructure Engineering roles, with demonstrated impact at staff or principal level
  • Expert-level Kubernetes knowledge, including cluster administration, Helm chart authoring, custom controllers/operators, network policies, RBAC, and multi-cluster management on AWS EKS
  • Deep expertise in CI/CD pipeline architecture and advanced deployment strategies (canary, blue/green, progressive delivery, feature flag integration) at scale
  • Strong proficiency with Infrastructure as Code using Terraform, including module design, state management, and multi-environment orchestration
  • Expert knowledge of Docker containerization, including multi-stage builds, security hardening, image optimization, and container runtime management
  • Production experience with Apache Kafka, including cluster management, topic design, consumer group strategies, and operational monitoring for high-throughput streaming workloads
  • Strong networking fundamentals: DNS (Route 53, internal DNS), TCP/IP, routing, API Gateway, load balancing (ALB/NLB), service mesh, VPC peering, transit gateways, and network troubleshooting
  • Extensive AWS experience spanning EKS, EC2, SQS, DynamoDB, IAM, VPC, CloudWatch, and related services in production environments
  • Hands-on experience with observability platforms: Grafana (dashboards, alerting), Prometheus (metrics, PromQL), Loki (log aggregation), and Honeycomb (distributed tracing, BubbleUp analysis)
  • Working familiarity with multiple language stacks including Node.js, React, Python, Java, and Ruby, sufficient to understand build systems, dependency management, and runtime characteristics
  • Experience operating within regulated environments, with practical knowledge of GDPR, CCPA, SOC 2, and compliance automation in MarTech or AdTech domains
  • Proven ability to influence engineering culture, drive adoption of new practices, and communicate complex technical strategies clearly to both technical and non-technical stakeholders
  • Demonstrated experience with GitLab CI/CD pipelines, including advanced pipeline features such as parent-child pipelines, dynamic environments, and security scanning integration
  • AWS certifications: Solutions Architect Professional, DevOps Engineer Professional, or Security Specialty
  • Experience with Statsig or similar feature flag and experimentation platforms for progressive delivery and A/B testing in production
  • Background in chaos engineering tools and practices (Gremlin, Litmus, Chaos Monkey) for proactive resilience validation
  • Experience building internal developer platforms (IDPs) or platform-as-a-product organizations
  • Familiarity with FinOps practices and cloud cost optimization strategies
  • Contributions to open-source DevOps/SRE tools or active participation in the broader infrastructure community
  • Experience with service mesh technologies (Istio, Linkerd) for advanced traffic management and security

Benefits

  • Unlimited PTO
  • Excellent medical, dental, and vision coverage
  • Employee Equity
  • Employee discounts, virtual wellness classes, pet insurance, and more

Company Overview

  • Zeta offers technology and marketing services to help brands acquire, engage, and retain customers. Founded in 2007 and headquartered in New York, New York, USA, it has a workforce of 1,001-5,000 employees. Its website is http://www.zetaglobal.com.

Company H1B Sponsorship

  • Zeta Global has a track record of offering H1B sponsorships, with 14 in 2026, 20 in 2025, 17 in 2024, 11 in 2023, 6 in 2022, 8 in 2021, 16 in 2020. Please note that this does not guarantee sponsorship for this specific role.