Senior Staff Infrastructure Engineer

Flex
4d | $204,000 - $300,000 | Remote

About The Position

Flex is looking for an exceptional Senior Staff Infrastructure Engineer with a passion for driving impact while managing high levels of ambiguity and Getting Stuff Done. In this role, you will be the bar raiser for the Infrastructure Engineering team, a small team responsible for creating and maintaining a sustainable set of platforms that ensures the effectiveness, reliability, and scalability of our systems. You'll lead and enable the team in designing, building, and maintaining robust and scalable infrastructure for engineers, customers, AI agents, and fellow employees. You'll collaborate closely with our service engineering teams and leaders to determine the direction of our platform and to anticipate what those teams will need next.

At Flex, we are an AI-first engineering organization. We believe that the future of infrastructure isn't just about managing resources; it's about building the intelligent, automated systems that manage them for us. We aren't looking for "task-takers"; we are looking for domain experts who use their deep knowledge of cloud architecture and SRE principles to steer these AI tools effectively. On this team, your value is defined by your ability to combine your technical mastery with an AI-augmented workflow to deliver world-class reliability at a growth-stage pace.

We are particularly interested in candidates with software engineering experience in languages like Java, Python, or TypeScript. This background will allow you to collaborate effectively with product teams, build tools and automation, and improve the developer experience across our engineering organization. You'll have the opportunity to influence key infrastructure and architecture decisions while ensuring high reliability and smooth delivery pipelines. This remote role requires a minimum of 10 years of cloud infrastructure experience.

Requirements

  • Deep mastery and architectural-level experience in designing, building, and operating highly-scaled, resilient cloud infrastructure on AWS, with deep expertise in services like EKS, S3, RDS, API Gateway, and various NoSQL/database solutions (DocumentDB, DynamoDB).
  • Expert-level proficiency and a track record of driving Infrastructure as Code (IaC) best practices using Terraform at an organizational scale, including developing reusable modules and governance frameworks.
  • Extensive experience and leadership in architecting and managing enterprise-grade, highly-available Kubernetes (EKS) and microservice platforms, driving adoption of modern container orchestration patterns.
  • Ownership of and demonstrated expertise in defining and implementing world-class CI/CD pipelines (e.g., GitHub Actions), significantly improving deployment speed, safety, and velocity across engineering teams.
  • Proven track record of architecting and delivering internal self-service platforms and advanced developer tooling that significantly boosts organizational productivity and automates common infrastructure operations.
  • Deep understanding and hands-on experience with advanced networking concepts (e.g., mesh, service discovery, cloud VPC design, security groups) to ensure global security and high performance.
  • Exceptional technical communication and cross-functional leadership skills, with the ability to drive consensus and alignment on complex technical strategies across executive, product, and engineering teams.
  • Demonstrated experience defining and leading the implementation of a unified, end-to-end observability (metrics, logs, traces) framework using industry-leading tools (Datadog preferred), transforming operational monitoring into predictive health insights.
  • Experience writing and reading code in at least one industry-standard language such as Java, Python, or TypeScript.
  • This remote role requires a minimum of 10 years of cloud infrastructure experience.

Responsibilities

  • Define the long-term technical strategy for scalable and resilient infrastructure, guiding cross-functional teams to implement solutions that optimize for performance, resilience, and cost at an organizational level.
  • Serve as the top technical authority ensuring the entire infrastructure platform aligns with critical business objectives and sets a high bar for industry standards.
  • Own the end-to-end "build vs. buy" evaluation and decision process for all major infrastructure technology, weighing long-term cost, maintenance, scalability, and strategic business alignment.
  • Serve as a hands-on technical authority, with the ability to dive deep into any part of the technology stack to diagnose complex systemic issues and drive best-in-class engineering solutions.
  • Establish and evangelize a world-class SRE culture and practices across all engineering teams, defining SLOs/SLIs and leading initiatives to achieve target reliability goals for critical systems.
  • Own and drive significant improvements in the end-to-end developer experience, including the architecture of self-service platforms, advanced CI/CD systems, and deployment mechanisms to maximize organizational velocity.
  • Take command of major, high-severity incident responses that cross team boundaries, instituting structured post-incident review processes to extract systemic lessons and drive fundamental, long-term resilience improvements.
  • Lead the vision for hyper-automation across the infrastructure domain, building sophisticated automated pipelines and tools that reduce operational toil to near zero and enable a self-healing system.
  • Communicate complex technical strategies and decisions to executive leadership, peer organizations, and the broader engineering community, driving alignment and buy-in for the infrastructure roadmap.

Benefits

  • Competitive medical, dental, and vision coverage available from Day 1
  • Company equity
  • 401(k) plan with company match (our company match kicks off at the beginning of 2026)
  • Unlimited paid time off + 13 company paid holidays
  • Parental leave
  • Flex Cares Program
  • Free Flex subscription
  • Competitive compensation