Senior AI/ML Specialist Solutions Architect (AI Infra & Cloud)

Lavendo, San Francisco, CA
$180,000 - $300,000

About The Position

We are seeking a Senior AI/ML Specialist Solutions Architect to join our client's team. This role offers the chance to design and implement scalable AI solutions for AI-focused customers, working with state-of-the-art technologies and contributing to one of the most powerful commercially available supercomputers.

Requirements

  • 5+ years of experience with cloud technologies and infrastructure, ideally in senior MLOps or Solutions Architect roles
  • Proven expertise in scaling and optimizing AI workloads across multi-node and multi-GPU environments
  • Demonstrated success delivering ML products, scaling from POC to production
  • Deep knowledge of ML frameworks like PyTorch and JAX
  • Strong background in the NVIDIA HPC ecosystem (CUDA, NCCL, InfiniBand)
  • Exceptional communication skills to engage both technical teams and business stakeholders
  • Legal authorization to work in the United States on a full-time basis without sponsorship

Nice To Haves

  • Programming Languages: Python, Go, Java, C++
  • Infrastructure as Code (IaC): Terraform, Ansible
  • Orchestration: Kubernetes (K8s), Slurm
  • DevOps Tools: Git, Docker, Helm
  • Big Data Frameworks: Spark, Kafka, Hadoop
  • Databases: SQL, NoSQL, and vector databases
  • ML Frameworks: PyTorch, TensorFlow, JAX, Hugging Face, scikit-learn

Responsibilities

  • Architect and optimize distributed training and inference systems for large-scale AI models
  • Design and deliver customer-focused solutions that maximize performance and business value
  • Lead the transition of ML pipelines from POC to scalable production systems
  • Build long-term customer relationships, ensuring satisfaction and alignment with strategic goals
  • Create whitepapers, deliver technical presentations, and host webinars to share insights and best practices
  • Provide technical leadership and mentor teams on AI infrastructure and deployment strategies
  • Collaborate with engineering and product teams to prioritize customer feedback and influence product roadmaps

Benefits

  • Competitive compensation: $180,000 - $300,000 per year (negotiable based on experience and location)
  • Full medical benefits: 100% company-paid medical, dental, and vision coverage for employees and families
  • 401(k) plan with a 4% match
  • Stock options plan
  • Flexible remote work environment
  • Company-paid short-term disability, long-term disability, and life insurance coverage
  • 20 weeks of paid parental leave for primary caregivers and 12 weeks for secondary caregivers
  • Up to $85/month for mobile and internet
  • Work with state-of-the-art AI and cloud technologies, including the latest NVIDIA GPUs
  • Be part of a team that operates one of the most powerful commercially available supercomputers
  • Contribute to sustainable AI infrastructure, with energy-efficient data centers that recover waste heat to warm nearby residential buildings