ePlus Technology - posted about 1 month ago
$140,000 - $170,000/Yr
Full-time • Mid Level
California, MD
1,001-5,000 employees
Professional, Scientific, and Technical Services

ePlus is seeking a Lead Technical Architect, AI Infrastructure, with strong hands-on expertise in enterprise infrastructure design, deployment, and delivery. This role will lead the implementation of next-generation data center solutions that support AI, HPC, and cloud workloads across compute, networking, and storage. You will be responsible for architecting, building, and automating GPU-accelerated, virtualized, and containerized environments, ensuring performance, scalability, and operational excellence across hybrid infrastructures.

  • Architect and deliver end-to-end data center solutions - compute, storage, networking - optimized for AI/ML and high-performance workloads.
  • Design, deploy, and support NVIDIA DGX, HGX, or GPU-based systems within customer environments.
  • Implement and manage virtualization platforms (VMware ESXi, vCenter, vSAN, NSX) and hyperconverged infrastructure.
  • Build and administer containerized platforms using Kubernetes (RKE, OpenShift, EKS, AKS, GKE).
  • Integrate and automate infrastructure workflows using Ansible, Terraform, Bash, or Python.
  • Collaborate with cross-functional teams - networking, DevOps, storage, and application owners - to ensure smooth project delivery.
  • Perform infrastructure assessments, capacity planning, and high-availability design for production environments.
  • Troubleshoot and optimize system performance across compute, network, and storage layers.
  • Provide technical leadership and documentation for customer deployments and internal delivery teams.
  • 6+ years of experience in data center architecture, infrastructure delivery, or systems engineering.
  • Proven experience designing and deploying enterprise infrastructure (servers, networking, storage, virtualization).
  • Hands-on expertise with VMware virtualization technologies (ESXi, vCenter, NSX, vSAN).
  • Experience with GPU-based compute platforms (NVIDIA DGX/HGX or equivalent preferred).
  • Proficiency in Kubernetes administration across multiple distributions.
  • Strong knowledge of enterprise networking (L2/L3, VLANs, routing) and storage architectures (SAN/NAS/CSI).
  • Experience with infrastructure automation and scripting (Ansible, Terraform, Bash, Python).
  • Excellent troubleshooting, documentation, and customer-facing communication skills.
  • Ability to deliver complex projects independently, on time, and in coordination with remote teams.
  • ePlus offers a full range of medical, financial, and/or other benefits (including 401(k) eligibility, employee stock purchase program and various paid time off benefits, such as vacation, sick time, and personal leave), dependent on the position offered.
  • Details of participation in these benefit plans will be provided if an offer of employment is extended.
  • ePlus Benefits highlights can be viewed here.