Cloud Engineering - Cloud / Platform Engineer

Virtue AI • San Francisco, CA

About The Position

Virtue AI sets the standard for advanced AI security platforms. Built on decades of foundational and award-winning research in AI security, its AI-native architecture unifies automated red-teaming, real-time multimodal guardrails, and systematic governance for enterprise apps and agents. Customers can deploy in minutes, across any environment, to keep their AI protected and compliant. We are a well-funded, early-stage startup founded by industry veterans, and we’re looking for passionate builders to join our core team.

As a Cloud / Platform Engineer, you will own how Virtue AI is deployed, scaled, and operated in customer environments. Your work directly enables one-click bring-up of our platform on AWS and GCP, from PoC to production. You’re opinionated about production infrastructure, hate snowflake deployments, and believe customers should be able to deploy complex AI systems without talking to an SRE for two weeks.

Requirements

Bachelor’s degree or higher in CS, CE, EE, or a related field
Strong experience deploying production systems on AWS and/or GCP
Deep hands-on experience with Docker and containerized workloads, Terraform or Pulumi, and Kubernetes (EKS, GKE, or equivalent)
Experience designing secure cloud architectures (IAM, VPCs, private networking)
Experience deploying GPU workloads in the cloud
Strong scripting skills (Python, Bash, or Go)
Ability to own infrastructure end-to-end and support real customers
Experience packaging products for AWS Marketplace, GCP Marketplace, and Helm-based enterprise installs

Nice To Haves

Experience deploying ML / LLM inference systems at scale
Familiarity with vLLM, sglang, Triton, or similar inference stacks
Experience with hybrid or on-prem deployments
Startup experience: you move fast, document what matters, and fix things properly

Responsibilities

Design and maintain one-click deployment workflows for Virtue AI on AWS and GCP
Build and operate production-grade cloud infrastructure for AI inference, agents, and guardrails
Own IaC (Terraform / Pulumi) for repeatable, auditable customer deployments
Package our services into secure, customer-ready deployment units (Docker, Helm, Marketplace images)
Enable GPU-backed inference (H100/A100/L4, etc.) with correct autoscaling, scheduling, and cost controls
Implement secure networking (VPCs, IAM, service accounts, private endpoints, firewalling)
Collaborate with backend, ML, and research teams to align infrastructure with model behavior, evals, and throughput needs
Debug and unblock customer deployments across cloud, hybrid, and restricted environments
Improve reliability, observability, and rollout safety (logging, metrics, health checks, rollback)

Benefits

Competitive base salary and equity, commensurate with skills and experience.
Impact at scale – Help define the category of AI security and partner with Fortune 500 enterprises on their most strategic AI initiatives.
Work on the frontier – Engage with bleeding-edge AI/ML and deploy AI security solutions for use cases that don't yet exist anywhere else.
Collaborative culture – Join a team of builders, problem-solvers, and innovators who are mission-driven and collaborative.
Opportunity for growth – Shape not only our customer engagements, but also the processes and culture of an early, lean team with plans to scale.
