[Remote] Senior Artificial Intelligence Engineer

Work from home, full-time role, hiring

Note: This is a remote position open to candidates in the USA. BlueAlly is a leading provider of IT services and solutions, helping organizations conquer IT complexity across various domains. The company is seeking a Senior AI Engineer to design, build, and operate enterprise AI systems. The role involves leading workstreams independently, mentoring junior engineers, and engaging with clients to deliver production AI outcomes.

Responsibilities

Lead end-to-end design, build, and operation of AI systems on AI Factory platforms (HPE PCAI, Dell AI Factory, Nutanix Enterprise AI, and adjacent ecosystem layers) across multiple client engagements
Engineer and tune LLM inference serving stacks — primary depth in vLLM with breadth across the inference ecosystem — for client latency, throughput, and cost targets
Tune inference performance through KV cache management, paged attention, batching strategies, and Dynamo-based disaggregated serving
Architect and operate MLOps pipelines covering model lifecycle, registries, deployment, rollback, and observability
Design and engineer RAG applications on top of vector databases — chunking strategies, retrieval tuning, reranking, citation handling, and context-window management
Build and tune prompt-engineering patterns at production scale — system prompts, structured output, tool and function calling
Design and maintain LLM evaluation harnesses — golden sets, regression suites, and online quality metrics
Engineer high-performance storage and networking for AI workloads — parallel filesystems, object storage tiers, and high-throughput, low-latency RDMA fabrics
Operate Kubernetes clusters underpinning AI workloads — namespaces, RBAC, resource quotas, network policies, storage classes, and ingress
Build and maintain container images, registries, and CI/CD pipelines for AI/ML services
Implement monitoring, alerting, logging, and capacity planning across the AI stack
Harden environments to meet client security and compliance requirements
Lead troubleshooting across bare metal, BIOS/firmware, OS, containers, GPUs, frameworks, and models
Engage directly with client stakeholders — technical and executive — to communicate status, root cause, options, and recommendations
Mentor less senior engineers and review their code; raise the technical bar of every engagement you join
Author runbooks, reference architectures, and knowledge base content; lead client knowledge transfer and enablement sessions
Participate in on-call rotation and incident response for production AI workloads
Contribute reusable patterns, tooling, and reference designs back to the practice

Skills

Experience: 7+ years of software, data, or infrastructure engineering, with 3+ years specifically working with modern AI / LLM systems
Software engineering: Production-quality Python at engineering level — testing, code review, version control fluency, and shipping code that other engineers depend on
Linux engineering: Deep production Linux experience, including system internals, performance tuning, and troubleshooting
Containers: Deep proficiency with Docker — image build, registry management, runtime tuning, and container security
Hardware fundamentals: Strong server-platform skills including CPU/GPU topologies, PCIe, BMC management, BIOS/firmware lifecycle, and physical-to-logical troubleshooting
AI Factory platforms: Hands-on experience deploying and operating one or more of HPE PCAI, Dell AI Factory, or Nutanix Enterprise AI
Inference stack — vLLM: Production experience deploying, tuning, and operating vLLM
Inference stack breadth: Working knowledge of multiple inference and model-serving frameworks beyond vLLM, with the ability to choose and tune the right tool for each workload
High-performance storage and networking: Hands-on experience with high-throughput, low-latency storage and network fabrics for AI workloads — including RDMA-class interconnects, parallel/object storage tiers, KV cache management, and Dynamo-style disaggregated serving
MLOps: Practical experience operating MLOps tooling and patterns — model registries, deployment pipelines, GitOps, lineage, and rollback
Vector databases and RAG: Hands-on experience deploying, tuning, and integrating vector databases and RAG pipelines, including the application-level engineering that sits on top of them
Prompt engineering and tool use: Production experience designing system prompts, structured output, function calling, and tool-using LLM patterns
Evaluation methodology: Demonstrated experience designing LLM evaluation harnesses — golden sets, regression suites, and quality/cost metrics
Client-facing skills: Demonstrated ability to engage directly with client stakeholders — running working sessions, presenting recommendations, and translating technical detail for non-technical audiences
Communication: Strong written and verbal communication — clear reference architectures, runbooks, and incident reports
Mentorship: Track record of mentoring more junior engineers and raising team technical quality through code review and pairing
Networking fundamentals: TCP/IP, DNS, load balancing, VLANs, and firewall administration
Multi-client delivery: Comfort working across multiple concurrent client environments and managing competing priorities under SLA
GPU operations: Experience with GPU drivers, CUDA toolchains, GPU partitioning (MIG/vGPU), and GPU-level monitoring
NVIDIA AI Enterprise: Deployment and operations experience with the NVAIE software stack
Ray: Familiarity with Ray for distributed training and inference scaling
Kubernetes: Working knowledge of Kubernetes administration — Helm, ingress, RBAC, storage classes
Identity and access: Integrating SSO and enterprise identity (LDAP, AD, OIDC/SAML), secrets management, tenant isolation
Fine-tuning: Familiarity with LoRA/QLoRA/PEFT and supervised fine-tuning workflows
Token economics: Experience optimizing inference cost — caching, prompt caching, model routing, and distillation
MSP / multi-tenant operations: Service-provider experience including chargeback/showback and tenant isolation patterns
Compliance frameworks: SOC 2, HIPAA, FedRAMP, FISMA, or CMMC environments
Public cloud and hybrid: Working experience with one or more public clouds and hybrid architectures
Infrastructure as Code: Terraform, Ansible, Helm, or similar

Company Overview

BlueAlly is a prime source of IT services for customers both large and small. It is a sub-organization of SecureWirelessWorks.com. Founded in 1999, it is headquartered in Vienna, Virginia, USA, and has a workforce of 201-500 employees. Its website is https://blueally.com/.

Company H1B Sponsorship

BlueAlly has a track record of offering H1B sponsorships, including 7 in 2020. Please note that this does not guarantee sponsorship for this specific role.
