[Remote] Senior Artificial Intelligence Engineer

Work from home, full-time role

Note: This is a remote position open to candidates in the USA. BlueAlly is a leading provider of IT services and solutions, helping organizations conquer IT complexity across various domains. The company is seeking a Senior AI Engineer to design, build, and operate enterprise AI systems, lead workstreams independently, mentor junior engineers, and engage with clients to deliver production AI outcomes.

Responsibilities

  • Lead end-to-end design, build, and operation of AI systems on AI Factory platforms (HPE PCAI, Dell AI Factory, Nutanix Enterprise AI, and adjacent ecosystem layers) across multiple client engagements
  • Engineer and tune LLM inference serving stacks — primary depth in vLLM with breadth across the inference ecosystem — for client latency, throughput, and cost targets
  • Tune inference performance through KV cache management, paged attention, batching strategies, and Dynamo-based disaggregated serving
  • Architect and operate MLOps pipelines covering model lifecycle, registries, deployment, rollback, and observability
  • Design and engineer RAG applications on top of vector databases — chunking strategies, retrieval tuning, reranking, citation handling, and context-window management
  • Build and tune prompt-engineering patterns at production scale — system prompts, structured output, tool and function calling
  • Design and maintain LLM evaluation harnesses — golden sets, regression suites, and online quality metrics
  • Engineer high-performance storage and networking for AI workloads — parallel filesystems, object storage tiers, and high-throughput, low-latency RDMA fabrics
  • Operate Kubernetes clusters underpinning AI workloads — namespaces, RBAC, resource quotas, network policies, storage classes, and ingress
  • Build and maintain container images, registries, and CI/CD pipelines for AI/ML services
  • Implement monitoring, alerting, logging, and capacity planning across the AI stack
  • Harden environments to meet client security and compliance requirements
  • Lead troubleshooting across bare metal, BIOS/firmware, OS, containers, GPUs, frameworks, and models
  • Engage directly with client stakeholders — technical and executive — to communicate status, root cause, options, and recommendations
  • Mentor less senior engineers and review their code; raise the technical bar of every engagement you join
  • Author runbooks, reference architectures, and knowledge base content; lead client knowledge transfer and enablement sessions
  • Participate in on-call rotation and incident response for production AI workloads
  • Contribute reusable patterns, tooling, and reference designs back to the practice

Skills

  • Experience: 7+ years of software, data, or infrastructure engineering, with 3+ years specifically working with modern AI / LLM systems
  • Software engineering: Production-quality Python at a professional engineering level, including testing, code review, version control fluency, and shipping code that other engineers depend on
  • Linux engineering: Deep production Linux experience, including system internals, performance tuning, and troubleshooting
  • Containers: Deep proficiency with Docker — image build, registry management, runtime tuning, and container security
  • Hardware fundamentals: Strong server-platform skills including CPU/GPU topologies, PCIe, BMC management, BIOS/firmware lifecycle, and physical-to-logical troubleshooting
  • AI Factory platforms: Hands-on experience deploying and operating one or more of HPE PCAI, Dell AI Factory, or Nutanix Enterprise AI
  • Inference stack — vLLM: Production experience deploying, tuning, and operating vLLM
  • Inference stack breadth: Working knowledge of multiple inference and model-serving frameworks beyond vLLM, with the ability to choose and tune the right tool for each workload
  • High-performance storage and networking: Hands-on experience with high-throughput, low-latency storage and network fabrics for AI workloads — including RDMA-class interconnects, parallel/object storage tiers, KV cache management, and Dynamo-style disaggregated serving
  • MLOps: Practical experience operating MLOps tooling and patterns — model registries, deployment pipelines, GitOps, lineage, and rollback
  • Vector databases and RAG: Hands-on experience deploying, tuning, and integrating vector databases and RAG pipelines, including the application-level engineering that sits on top of them
  • Prompt engineering and tool use: Production experience designing system prompts, structured output, function calling, and tool-using LLM patterns
  • Evaluation methodology: Demonstrated experience designing LLM evaluation harnesses — golden sets, regression suites, and quality/cost metrics
  • Client-facing skills: Demonstrated ability to engage directly with client stakeholders — running working sessions, presenting recommendations, and translating technical detail for non-technical audiences
  • Communication: Strong written and verbal communication — clear reference architectures, runbooks, and incident reports
  • Mentorship: Track record of mentoring more junior engineers and raising team technical quality through code review and pairing
  • Networking fundamentals: TCP/IP, DNS, load balancing, VLANs, and firewall administration
  • Multi-client delivery: Comfort working across multiple concurrent client environments and managing competing priorities under SLA
  • GPU operations: Experience with GPU drivers, CUDA toolchains, GPU partitioning (MIG/vGPU), and GPU-level monitoring
  • NVIDIA AI Enterprise: Deployment and operations experience with the NVAIE software stack
  • Ray: Familiarity with Ray for distributed training and inference scaling
  • Kubernetes: Working knowledge of Kubernetes administration — Helm, ingress, RBAC, storage classes
  • Identity and access: Integrating SSO and enterprise identity (LDAP, AD, OIDC/SAML), secrets management, tenant isolation
  • Fine-tuning: Familiarity with LoRA/QLoRA/PEFT and supervised fine-tuning workflows
  • Token economics: Experience optimizing inference cost — caching, prompt caching, model routing, and distillation
  • MSP / multi-tenant operations: Service-provider experience including chargeback/showback and tenant isolation patterns
  • Compliance frameworks: SOC 2, HIPAA, FedRAMP, FISMA, or CMMC environments
  • Public cloud and hybrid: Working experience with one or more public clouds and hybrid architectures
  • Infrastructure as Code: Terraform, Ansible, Helm, or similar

Company Overview

  • BlueAlly is a prime source of IT services for customers both large and small. It is a sub-organization of SecureWirelessWorks.com, was founded in 1999, and is headquartered in Vienna, Virginia, USA, with a workforce of 201-500 employees. Its website is https://blueally.com/.

Company H1B Sponsorship

  • BlueAlly has a track record of offering H1B sponsorships, with 7 in 2020. Please note that this does not guarantee sponsorship for this specific role.