Redpanda is pioneering the Agentic Data Plane (ADP) - a new category in AI infrastructure that makes it simple and secure to connect AI agents with enterprise data and systems. Built on a multi-modal data streaming engine, Redpanda empowers agentic applications that reason and act in real time with speed, autonomy, and precision. Global leaders including Activision Blizzard, Cisco, Moody's, Texas Instruments, Vodafone, and two of the top five banks in the U.S. rely on Redpanda to process hundreds of terabytes of data a day. Backed by premier venture investors Lightspeed, GV, and Haystack VC, Redpanda is a diverse, people-first organization with teams distributed around the globe.

About the Role:

We're looking for a Staff Production Operations Engineer to drive Redpanda's reliability operations program. This role combines hands-on site reliability engineering with planning and coordination skills to ensure a world-class operations practice across a globally distributed engineering team.

In this role, you'll work with the broader Engineering team, Engineering leadership, Product, and Customer Success to drive operational excellence. You'll coordinate our on-call and incident lead rotations, drive blameless post-incident reviews, and own the processes that help us respond faster, learn more from outages, and systematically improve reliability. We're looking for someone who can use AI agents to automate the operational toil that slows teams down, building on Redpanda's own ADP to do it.
Job Type
Full-time
Career Level
Senior
Education Level
No Education Listed