Staff/Senior Software Engineer, Infrastructure

Recruiting From Scratch
San Francisco, NY
Hybrid

About The Position

Our client is a category-defining AI healthcare company building cutting-edge infrastructure to power real-time clinical insights from medical conversations. Their platform leverages advanced machine learning and large-scale distributed systems to transform unstructured healthcare data into actionable intelligence for providers. With significant funding, rapid enterprise adoption, and a world-class team spanning engineering, AI research, and clinical expertise, the company is scaling aggressively. This is a rare opportunity to join a high-growth environment solving deeply meaningful problems at the intersection of healthcare, AI, and infrastructure.

Requirements

  • 9+ years of backend or infrastructure engineering experience
  • Strong experience building and scaling distributed systems in production environments
  • Deep expertise in performance optimization, system scalability, and reliability engineering
  • Proficiency in languages such as Python or TypeScript
  • Hands-on experience with cloud-native technologies (e.g., Kubernetes, GCP, Terraform)
  • Track record of improving system performance, reducing latency, and enabling scale
  • Experience working across teams to influence architecture and infrastructure decisions
  • Strong ownership mindset with the ability to operate in complex, fast-scaling environments
  • Excellent communication skills and ability to translate technical concepts across teams

Nice To Haves

  • Experience with load testing, chaos engineering, and performance benchmarking
  • Background in developer platforms, internal tooling, or platform engineering
  • Familiarity with SLOs, error budgets, and production reliability frameworks
  • Experience supporting multi-tenant or high-throughput systems
  • Prior experience in high-growth startups or scaling environments
  • Interest in AI/ML infrastructure or healthcare technology

Responsibilities

  • Design and optimize large-scale distributed systems to improve performance, reliability, and scalability
  • Build and integrate load testing and chaos engineering practices into CI/CD pipelines
  • Identify latency and performance bottlenecks using observability, profiling, and monitoring tools, and implement solutions at the code level
  • Drive architectural changes to migrate and scale applications across modern infrastructure (event-driven systems, cloud runtimes, databases)
  • Partner with engineering teams to re-architect applications for multi-tenant, high-scale environments
  • Develop internal developer tools and platform capabilities to improve engineering velocity
  • Define and implement SLOs, error budgets, and system health metrics to support reliable deployments
  • Improve incident response systems, observability, and operational excellence across teams
  • Collaborate cross-functionally and embed with teams to guide infrastructure adoption and best practices
  • Contribute to technical thought leadership through documentation, training, and potentially external community engagement

Benefits

  • Competitive salary and equity package
  • Opportunity to work on high-scale, mission-critical systems in a rapidly growing company
  • Hybrid work environment in SF or NYC
  • Work alongside top-tier engineers, researchers, and healthcare professionals
  • Significant career growth opportunity in a hyperscaling environment
  • Exposure to cutting-edge AI, cloud-native, and distributed systems challenges