We are seeking a highly skilled and forward‑thinking Lead Systems Operations Engineer to join our Technology Operations team. This role is ideal for someone who excels in Kubernetes and OpenShift platform operations, drives operational excellence, and leads initiatives that improve stability, automation, and service reliability. You will play a key role in operating and improving our cloud‑native platforms, reducing operational toil, and ensuring the resilience and compliance of critical infrastructure services.

In this role, you will:

Lead complex, broad‑impact initiatives, including providing high‑level systems consultation to the technology teams
Platform Operations Leadership: Lead day‑to‑day platform operations (Redis, OpenShift), including cluster maintenance, upgrades, performance monitoring, and troubleshooting
Operations Excellence: Improve operational practices to meet new incident SLAs and strengthen practices during incident and problem management
Incident Response & Problem Management: Serve as an operational lead during incidents, driving rapid diagnosis, resolution, root‑cause analysis, and long‑term corrective actions
Operational Automation: Develop or enhance automation (Python, Bash, GitOps workflows, or AI‑assisted tools), build AI agents and MCP servers and tools, and add MCP skills that eliminate manual effort and streamline run processes
Platform Readiness: Lead platform lifecycle activities, including new cluster builds, configuration, onboarding, upgrades, and cluster decommissioning, ensuring consistency, reliability, and compliance across environments
Collaboration & Enablement: Partner with engineering, SRE, security, and development teams to implement repeatable operational patterns, guardrails, and platform readiness standards
Security, Compliance & Governance: Ensure platform operations follow organizational policies, security standards, audit controls, and regulatory requirements
Continuous Improvement: Identify operational gaps, recurring issues, or inefficiencies and lead initiatives to enhance reliability, resiliency, and operational maturity
Job Type
Full-time
Career Level
Senior
Education Level
No Education Listed