AI DevOps Engineer

Alignity Solutions
New York, NY
Hybrid

About The Position

Our client is seeking a highly skilled AI DevOps Engineer to design, build, and operate scalable, secure, and production-grade infrastructure supporting modern AI platforms and LLM-powered applications. This role sits at the intersection of DevOps, Platform Engineering, Site Reliability Engineering (SRE), and AI Infrastructure, enabling high-performance AI systems, agent-based workflows, and enterprise AI platforms within a regulated financial services environment. The ideal candidate will have strong expertise in Kubernetes, Terraform, cloud infrastructure, automation, and AI platform operations, along with experience supporting modern AI/LLM workloads in production environments.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience.
  • Proven experience as a DevOps Engineer, Platform Engineer, or Site Reliability Engineer (SRE).
  • Strong hands-on experience managing large-scale production infrastructure.
  • Expertise with Terraform and Infrastructure-as-Code (IaC) methodologies.
  • Strong experience deploying and operating Kubernetes-based environments.
  • Experience supporting infrastructure for AI platforms or LLM-based applications.
  • Strong understanding of automation, scalability, reliability, and cloud-native architectures.

Nice To Haves

  • Experience supporting production-grade LLM applications and AI agent workloads.
  • Hands-on experience with vector databases such as Pinecone, Weaviate, or pgvector.
  • Experience building or supporting AI tooling and internal AI developer platforms.
  • Knowledge of observability, monitoring, capacity planning, and reliability engineering for AI/ML systems.
  • Experience working within financial services or other highly regulated industries.
  • Strong communication and cross-functional collaboration skills.

Responsibilities

  • Design, deploy, and manage scalable infrastructure for AI and LLM-based applications in production environments.
  • Build and maintain Infrastructure-as-Code (IaC) using tools such as Terraform for secure, repeatable, and auditable deployments.
  • Deploy, manage, and scale containerized environments using Kubernetes with a focus on high availability and reliability.
  • Implement DevOps, Platform Engineering, and SRE best practices to improve system reliability, scalability, and operational efficiency.
  • Support AI platform services for model serving, inference, experimentation, and evaluation workflows.
  • Deploy and maintain infrastructure supporting AI agents, orchestration frameworks, and LLM runtime dependencies.
  • Design and manage vector database infrastructure, such as Pinecone, Weaviate, or PostgreSQL with pgvector, for RAG and semantic search use cases.
  • Enable AI developer platforms and tooling for engineering teams building AI-powered applications.
  • Implement monitoring, alerting, logging, and incident response processes for mission-critical AI systems.
  • Collaborate with security, compliance, and governance teams to ensure adherence to regulatory and enterprise security standards.
  • Continuously improve automation, developer experience, and operational processes for AI infrastructure environments.

Benefits

  • Visit us at http://alignity.io/careers. Alignity Solutions is an Equal Opportunity Employer, M/F/V/D.