[Remote] Software Engineer II - Cloud Infrastructure Engineer
Note: This is a remote position open to candidates in the USA. Abnormal AI is an AI-native behavioral security platform that protects enterprises from advanced threats. The company is seeking a Cloud Infrastructure Engineer to manage the full lifecycle of its cell-based deployment architecture, ensuring scalable and reliable infrastructure as it expands into new product lines.
Responsibilities
- Bootstrap new cells end-to-end: full infrastructure setup (compute, networking, IAM, etc.) and complete application stack deployment
- Maintain and evolve cell lifecycle tooling to make provisioning repeatable, auditable, and operator-friendly—reducing manual steps and time-to-production
- Partner with application and product teams to design and implement scalable, cell-native architecture approaches
- Design, build, test, scale, monitor, and maintain secure, cost-efficient infrastructure in a multi-cloud environment (AWS and Azure)
- Triage and resolve complex cross-layer issues quickly, then drive root cause fixes that prevent recurrence
- Drive down technical debt and toil through automation and systemic improvements to the cell deployment lifecycle
- Participate in the on-call rotation with a learning-oriented mindset, identifying systemic gaps and driving long-term reliability improvements
- Keep cross-team communication low-friction and high-signal: proactive and well-contextualized
- Contribute as a core member of an agile team through sprint planning, standups, and execution with a strong sense of ownership and teamwork
Skills
- Bachelor's degree in Computer Science or a related technical field
- 4+ years of experience engineering cloud infrastructure for production microservice systems, with attention to performance, reliability, security, and cost
- 2+ years of Python experience, including application-layer code (not just scripts)
- 1+ year of experience with Kubernetes and Helm
- 1+ year of AWS experience (VPC, IAM, S3, Route 53, CloudFront, EKS, ECS, CloudWatch)
- 1+ year of Terraform and HCL experience
- Comfort operating across infra and application engineering without hard boundaries
- Experience with on-call rotations, incident response, and operating production-grade systems
- Practical experience using Generative AI tools in day-to-day engineering workflows
- Strong communication skills and the ability to thrive in a fast-paced, remote-first environment—balancing autonomy with collaboration, demonstrating a bias toward action, and maintaining a positive, constructive mindset
- Experience with Bash, Golang, Terragrunt, and data infrastructure (Spark, Databricks)
- Hands-on experience with cell-based, multi-tenant, or multi-region infrastructure architectures
- Familiarity with Generative AI developer tools such as Claude Code, and experience driving AI-first engineering workflows
- Prior experience building large-scale IaC abstractions or internal developer platforms
- AWS certifications
Benefits
- Bonus or incentive compensation
- Equity
- A comprehensive benefits package