About The Position

The Staff Infrastructure Reliability Engineer is responsible for the technical leadership of Redfin’s production database and storage systems. They will work with the database team manager and other database and storage engineers to develop and execute a strategy that supports system reliability and maintainability. They will help lead the team’s strategy as we expand the database and storage technologies available to Redfin engineers. The Staff Infrastructure Reliability Engineer will also collaborate with senior engineers and leaders across Redfin and partner companies to solve large-scale problems. This role requires depth in design, collaboration with internal teams, a proactive approach to problem solving, and the ability to share complex ideas with senior leadership and secure their support.

Requirements

  • 7+ years of experience managing systems in AWS or a similar cloud environment, including compute and storage, with an emphasis on solution development and execution.
  • 5+ years of experience with at least one, but preferably more, of the following: PostgreSQL or similar RDBMS; AWS Aurora/RDS; AWS S3; ElastiCache; OpenSearch; DynamoDB.
  • Proven history of architecting, building, scaling, and supporting cloud infrastructure technologies, specializing in database and storage services, and the ability to communicate the direct business impact of this work.
  • Extensive experience with Linux administration and Linux scripting, including Python script development.
  • Experienced mentor of other engineers with the ability to guide a team of engineers to identify and implement solutions to difficult problems.
  • Committed to best practices that set your team up for long-term success, including infrastructure as code, configuration management tooling, and security practices.
  • Deep knowledge and professional use of at least one AI code generation tool, such as Anthropic Claude Code, GitHub Copilot, Cursor, or similar, to implement key efficiencies for cloud infrastructure.
  • Excellent communication skills that allow you to connect with and influence everyone from your immediate team up through senior leadership.
  • Understanding of core reliability principles, including monitoring, alerting, and incident management, and the ability to implement them.

Responsibilities

  • Help lead the database and storage strategy at Redfin, including architecture, management, and access patterns.
  • Lead complex technical discussions with a variety of audiences, including software and systems engineers and business leaders.
  • Architect and lead implementation of cloud database and storage systems with a focus on reliability, observability, scalability, and security.
  • Support large-scale, high-volume databases, both self-managed and specialized AWS managed offerings, including management activities such as upgrades, backups, recovery, and migrations.
  • Use and evangelize approved AI code generation tools to document, architect, and create code.
  • Plan and participate in high availability and disaster recovery planning/drills.
  • Lead incident resolution, including performing root cause analyses.
  • Use systems knowledge to promote scaling and performance for services across Redfin and some partner companies.
  • Participate in an on-call rotation for about one week per month.

Benefits

  • Medical, dental, and vision benefits
  • 401(k) retirement plan
  • Paid time off