DevOps, Site Reliability Engineer, Vice President

Citi • Jersey City, NJ

About The Position

Position Summary: The Vice President, Technology (DevOps/SRE) will lead the engineering and operation of CI/CD platforms, container orchestration, automation, and reliability practices for mission-critical applications. This is a hands-on role requiring deep technical competence to design and troubleshoot pipelines, harden platforms, drive upgrades/migrations, and implement security remediation while partnering with application teams and risk/compliance stakeholders to ensure stable, resilient production outcomes.

Requirements

6-10 years of experience in DevOps, Site Reliability Engineering, or Infrastructure Engineering, with demonstrated ownership of production platforms and delivery outcomes.
Hands-on administration and troubleshooting skills across Linux and Windows, including strong command-line diagnostics and log analysis.
Strong experience with Kubernetes and/or OpenShift, including Helm-based deployments and cluster troubleshooting.
Experience with automation/configuration management (Ansible and Ansible Tower/Starfleet or equivalent) and a strong bias toward eliminating manual operational work.
Demonstrated experience driving vulnerability remediation, patching, and platform hardening in partnership with security/compliance teams.
Proven ability to plan and execute platform migrations and upgrades (OS, middleware, databases), including change management, runbooks, and production readiness.
Strong communication and stakeholder management skills; able to influence engineering teams and senior leaders while remaining hands-on in critical technical work.

Nice To Haves

Demonstrated ability to build CI/CD pipelines and shared templates from scratch, including governance for onboarding new applications/teams.
Experience leading enterprise migration and upgrade programs with multiple stakeholders and tightly managed change windows.
Experience with Infrastructure-as-Code (e.g., Terraform) and policy-driven environment provisioning.
Relevant certifications such as Kubernetes (CKA/CKAD) and/or Linux, SysOps, or cloud platform certifications.

Responsibilities

CI/CD ownership: Architect, implement, and operate scalable CI/CD pipelines and release workflows; define standards for build, test, security scanning, and deployment automation.
Tooling and platform engineering: Provide deep expertise across Jenkins, UDeploy, Tekton, and Harness (or equivalent), including architecture, configuration, upgrades, and governance.
Incident and pipeline triage: Diagnose and remediate failed pipelines (Jenkins/UDeploy) and deployment issues quickly; drive root-cause analysis and implement preventative controls.
Hands-on systems administration: Perform command-line troubleshooting and administration across Linux and Windows; partner with infrastructure teams to resolve OS, network, and runtime issues impacting production.
Platform migrations and upgrades: Lead and execute OS (e.g., RHEL) and platform upgrade initiatives across middleware and databases; plan cutovers, rollback strategies, and production readiness.
Middleware lifecycle management: Coordinate upgrades for critical runtimes and middleware (Node.js, Python, JDK, Nginx, Tomcat); enable application migrations with minimal downtime and clear runbooks.
Elasticsearch operations: Set up, troubleshoot, upgrade, back up, and monitor Elasticsearch clusters; ensure performance, availability, and recoverability.
Security and compliance: Proactively remediate vulnerabilities and implement security fixes across platforms and pipelines in line with compliance requirements; partner with security teams on evidence and controls.
Infrastructure lifecycle processes: Manage certificate renewals and virtual server migrations, and coordinate secrets rotation; ensure renewals and migrations are executed safely and on schedule.
Container platforms: Engineer and operate Kubernetes and OpenShift platforms, including deployment patterns, scaling, upgrades, and cluster-level troubleshooting.
Standardized deployments: Use Helm to package, version, and deploy applications consistently across environments; drive chart standards and reuse.
Automation and configuration management: Build and maintain automation using Ansible (including Tower/Starfleet), focusing on repeatability, auditability, and reduced manual operations.

Benefits

In addition to salary, Citi’s offerings may also include, for eligible employees, discretionary and formulaic incentive and retention awards.
Citi offers competitive employee benefits, including: medical, dental & vision coverage; 401(k); life, accident, and disability insurance; and wellness programs.
Citi also offers paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays.
For additional information regarding Citi employee benefits, please visit citibenefits.com.
Available offerings may vary by jurisdiction, job level, and date of hire.
