Staff Software Engineer - Infrastructure

Rad AI

12d•Remote

About The Position

At Rad AI, we’re on a mission to transform healthcare with artificial intelligence. Founded by a radiologist, our AI-driven solutions are revolutionizing radiology: saving time, reducing burnout, and improving patient care. With one of the largest proprietary radiology report datasets in the world, our AI has helped uncover hundreds of new cancer diagnoses and reduced error rates in tens of millions of radiology reports by nearly 50%.

Rad AI has secured over $140M in funding, including a recently oversubscribed Series C ($68M round) led by Transformation Capital, bringing our valuation to $528M. Our investors include Khosla Ventures, World Innovation Lab, Gradient Ventures, Cone Health Ventures, and others, all backing our mission to empower physicians with cutting-edge AI.

Our latest advancements in generative AI are used by thousands of radiologists daily, supporting more than one-third of radiology groups and healthcare systems and nearly 50% of all medical imaging in the U.S. at partners including Cone Health, Jefferson Einstein Health, Geisinger, Guthrie Healthcare System, and Henry Ford Health. Recognized as one of the most promising healthcare AI companies by CB Insights and AuntMinnie, and ranked by Deloitte as the 19th fastest-growing company in North America, we are building AI-powered solutions that make a real impact. Most recently, Rad AI was named to CNBC’s Disruptor 50 list, highlighting the innovation and momentum behind our mission. If you’re ready to shape the future of healthcare, we’d love to have you on our team!

The Platform Engineering organization at Rad AI builds the foundations that power all of our products (Reporting, Impressions, and Continuity) and enables product teams to ship reliably, safely, and at scale. Within Platform, the Infrastructure team owns our core cloud infrastructure, platforms, and reliability practices. We’re hiring a Staff Software Engineer - Infrastructure to help us design and operate robust, scalable systems. In this role, you’ll contribute to infrastructure architecture, reliability practices, and thoughtful improvements to our workflows. If you’re passionate about building resilient platforms and enjoy collaborating across functions, we’d love to hear from you.

Requirements

Bring 8+ years of hands-on infrastructure / platform development experience (or equivalent practical experience) in modern, cloud-native environments, with a track record of owning critical systems in production.
Have deep expertise with AWS (preferred) and/or GCP, including core networking, compute, storage, and managed services.
Are highly proficient in at least one programming/scripting language used for infrastructure work (e.g., Python or Bash) and comfortable building tooling and automation for other engineers.
Have strong experience with Kubernetes, containers (Docker), and container orchestration, and understand how to operate these systems reliably at scale.
Are comfortable with Infrastructure as Code (Terraform preferred; Pulumi or similar also relevant) and Git-based workflows.
Possess solid Linux fundamentals and are comfortable debugging issues at the OS, networking, and application layers.
Have demonstrable experience leading complex, cross-team initiatives from design through rollout—communicating tradeoffs, aligning stakeholders, de-risking launches, and measuring impact.
Communicate clearly and empathetically with both technical and non-technical partners, and enjoy mentoring engineers at multiple levels.
Take a data-informed, pragmatic approach to decision-making—balancing ideal architecture with business needs, delivery timelines, and team capacity.

Nice To Haves

Experience in regulated environments (e.g., HIPAA) or prior work in healthcare or healthtech.
Background in platform or security engineering, especially around access control, encryption, auditability, and compliance.
Experience working closely with ML / data teams or with ML platforms (e.g., Airflow, Ray, ML pipelines, model serving stacks).
Familiarity with observability stacks (CloudWatch, New Relic, Grafana, OpenTelemetry, etc.).
Experience designing or operating internal developer platforms, SDKs, or reusable frameworks that standardize how services are built and deployed.
Prior experience at a fast-growing startup where you’ve helped scale infrastructure, processes, and teams.

Responsibilities

Influence the technical direction for infrastructure and platform capabilities that support our rapidly growing AI product suite.
Architect and evolve our cloud infrastructure (primarily on AWS) across container orchestration (Kubernetes, Elastic Container Service), serverless (e.g., Lambda), virtual machines (e.g., EC2), and data stores to support current and future products.
Work closely with Platform leadership, product engineering, data, and ML teams to design systems that are robust, observable, and compliant in a healthcare environment.
Define and drive infrastructure strategy for the Platform org—partnering with engineering leadership to align roadmaps, set standards, and sequence work for maximum business impact.
Secure networking, identity, and access patterns across environments.
Improve reliability and operational excellence by defining SLOs, SLIs, and error budgets for core platform services.
Lead and participate in blameless post-incident reviews and translate learnings into systemic improvements.
Own observability and monitoring strategy across logging, metrics, and tracing, ensuring we can detect, debug, and prevent issues efficiently.
Mentor and level up engineers across Platform and product teams—reviewing design docs, guiding architecture decisions, and modeling high standards for reliability, security, and maintainability.
Partner with security and compliance stakeholders to ensure our infrastructure and operational practices meet HIPAA and other healthcare requirements.
Advocate for and implement developer experience improvements, such as better CI/CD workflows, faster feedback loops, and tooling that reduces cognitive load for product teams.