Infrastructure Engineer

Roboflow
$165,000 - $200,000
Hybrid

About The Position

As a member of our infrastructure team, you'll be at the heart of a fast-paced startup environment. Your primary focus will be on striking the right balance between rapid delivery, high reliability, and robust security. This isn't a traditional, siloed role; you'll need to wear many hats, acting as an infrastructure engineer one moment and a developer or even a security analyst the next. You will be securing, scaling, and maintaining the core infrastructure that powers our product. This includes our cloud architecture, databases, file storage, search clusters, microservices, and machine learning pipelines. You'll work closely with our product team and collaborate across the company on product, operations, and customer-facing projects, constantly context-switching to solve the next critical challenge.

Requirements

  • Production experience with Kubernetes: Building and managing containerized applications at scale.
  • Infrastructure-as-Code (IaC): Using Terraform, Helm charts, Bash scripting, and Python to automate everything.
  • Scale & Site Reliability: Operating, monitoring, and scaling large-scale applications (especially in ML/AI) in AWS and/or GCP.
  • Development Skills: Proficiency in Node.js and Python, with the ability to collaborate with full-stack developers on designing and operating SaaS applications.
  • ML/Big Data Ops: Hands-on experience with the infrastructure required for machine learning at scale (GPUs, Docker, Kubernetes) and familiarity with libraries like PyTorch or TensorFlow.
  • CI/CD Automation: Experience with tools like GitHub Actions or Spacelift to build and deploy code efficiently.
  • Pragmatic Security: Awareness of security best practices for cloud operations and how they can be applied to startup environments.
  • AI-Native Engineering: Leveraging LLMs and AI tools to accelerate the development lifecycle—from writing and refactoring code to identifying security vulnerabilities and optimizing infrastructure costs.

Nice To Haves

  • Many Roboflowers have used our tools before joining. One of the best ways to stand out amongst other applicants is to write about something you have built with Roboflow or contribute to one of our open source projects.
  • Meaningful contributions to successful open source devtool and security projects.

Responsibilities

  • Running and optimizing a high-availability machine learning inference service.
  • Collaborating with customer security teams to ensure secure integration.
  • Developing creative IaC solutions to scale our platform cost-effectively.
  • Working with the engineering team to define SLOs/SLAs and participating in incident response.
  • Improving the Observability and Alerting stack and the processes built around it.
  • Diving deep into our stack to identify and act on cost-optimization opportunities.
  • Contributing code (Python, JavaScript, etc.) as part of a team designing and deploying new product features.
  • Fixing security vulnerabilities and bugs.
  • Hardening our systems and processes to meet SOC 2, HIPAA, and GDPR requirements, making us audit-ready.
  • Participating in an on-call rotation to ensure platform reliability.

Benefits

  • $4,000/yr travel stipend to travel anywhere, anytime, to work alongside other Roboflowers.
  • $350/mo productivity stipend to spend on things that make your work environment more productive, like high-speed internet at home or a co-working space.
  • Coverage of up to 100% of health insurance costs for you and your partner or family.
  • Equity in the company, so we are all invested in the future of computer vision.