The application window is expected to close on: 06/15/2026. The job posting may be removed earlier if the position is filled or if a sufficient number of applications are received.
Meet the Team
We are an agile team inside Cisco IT, building the next-generation NoSQL and vector databases on cloud platforms that will be used by all of Cisco as we move to cloud-native applications. This is a small team of highly motivated individuals using Agile Scrum methodology. Our team is responsible for building and operating Hybrid Cloud Database services in a DevSecOps model. We move at a fast pace and are passionate about cloud and automation. We have a history of building clouds at a large scale and are looking for someone who is as passionate about it as we are.
Your Impact
In this role, you will ensure production services are scalable, resilient, high-performing, and secure. You’ll support uptime through an on-call rotation, monitoring, and alerting to meet SLOs and SLAs. You’ll strengthen reliability by conducting disaster recovery drills and managing incidents: investigating root causes, applying remediation, and driving continuous improvement. You’ll define reliability and security requirements for systems and components to meet company, customer, and regulatory objectives. You’ll improve operational efficiency by automating repetitive tasks and mitigating failure points. You’ll also develop tools and techniques for early detection of issues in products, packaging, processes, and product reliability.
Serves as an experienced professional resource, independently applying best practices and business knowledge to improve products or services while guiding and supporting less experienced colleagues. Understands project and/or department needs and establishes relationships with appropriate cross-functional partners to gather input, collect information, and complete work steps.
Designs and deploys small to mid-size or moderately complex solutions to optimize reliability, availability, latency, and performance. Builds automated platforms and applies design, deployment, and coding expertise to enhance reliability, scalability, and velocity; designs and tests high availability and disaster recovery measures across regions and customers. Forecasts capacity and builds reports to determine when resources will reach their limits. Designs and implements tools to monitor and provide transparency into the performance and reliability of our infrastructure; collaborates with developers and Ops to identify issues, serves as on-call SRE, and leads postmortems and root cause analyses. Ensures security controls are built into architectural designs, collaborates with the security team in designing or reviewing those controls, and may actively contribute to security incident response.
Job Type
Full-time
Career Level
Senior