DevOps Engineer (Onsite)

Cognizant · Bridgewater, NJ
26d · Onsite

About The Position

As a DevOps Engineer, you will make an impact by administering, monitoring, and maintaining production-grade Kubernetes clusters deployed in an on-prem datacenter. You will be a valued member of the team and work closely with the Retail team.

In this role, you will: Perform cluster lifecycle operations, including upgrades, patching, node provisioning, and capacity planning. Implement and manage RBAC, network policies, Pod Security Policies, and Namespaces for multi-tenant environments. Maintain Ingress controllers, service meshes, and API gateways. Troubleshoot cluster-level issues, including node failures, pod scheduling, and resource bottlenecks.

Containerization & Orchestration: Build and manage Docker images, Compose files, and private registries. Deploy and orchestrate microservices using Kubernetes, Helm charts, and Red Hat OpenShift. Optimize container resource usage, autoscaling policies, and affinity/anti-affinity rules.

CI/CD Integration: Design and maintain CI/CD pipelines using DevOps tools. Automate deployment of containerized AI applications into Kubernetes clusters. Develop reusable pipeline templates and scripts for rapid onboarding and POC delivery.

AI Workflow Enablement (ClearML): Integrate ClearML for experiment tracking, model versioning, and pipeline orchestration. Collaborate with AI/ML teams to deploy containerized models and automate GPU job scheduling. Build custom ClearML agents and workflows for reproducible experimentation and deployment.

Scripting & Tooling: Develop automation scripts in Shell and Python. Build internal tools to streamline cluster operations and observability.

Work model: At Cognizant, we strive to provide flexibility wherever possible, and we are here to support a healthy work-life balance through our various wellbeing programs. Based on this role's business requirements, this is an onsite position requiring 5 days a week in a client office in Bridgewater, NJ. The working arrangements for this role are accurate as of the date of posting. This may change based on the project you're engaged in, as well as business and client requirements. Rest assured, we will always be clear about role expectations.

Requirements

  • 12+ years of experience in Kubernetes administration and DevOps.
  • Strong hands-on expertise in Docker, Kubernetes, or OpenShift in on-prem environments.
  • Deep understanding of container orchestration, microservices, and distributed systems.
  • Experience with Helm, GitOps, and secure credential management.
  • Proficiency in Linux administration, Shell scripting, and Python.
  • Experience with ClearML or similar AI workflow tools (e.g., MLflow, Kubeflow).

Responsibilities

  • Perform cluster lifecycle operations, including upgrades, patching, node provisioning, and capacity planning.
  • Implement and manage RBAC, network policies, Pod Security Policies, and Namespaces for multi-tenant environments.
  • Maintain Ingress controllers, service meshes, and API gateways.
  • Troubleshoot cluster-level issues, including node failures, pod scheduling, and resource bottlenecks.
  • Build and manage Docker images, Compose files, and private registries.
  • Deploy and orchestrate microservices using Kubernetes, Helm charts, and Red Hat OpenShift.
  • Optimize container resource usage, autoscaling policies, and affinity/anti-affinity rules.
  • Design and maintain CI/CD pipelines using DevOps tools.
  • Automate deployment of containerized AI applications into Kubernetes clusters.
  • Develop reusable pipeline templates and scripts for rapid onboarding and POC delivery.
  • Integrate ClearML for experiment tracking, model versioning, and pipeline orchestration.
  • Collaborate with AI/ML teams to deploy containerized models and automate GPU job scheduling.
  • Build custom ClearML agents and workflows for reproducible experimentation and deployment.
  • Develop automation scripts in Shell and Python.
  • Build internal tools to streamline cluster operations and observability.

Benefits

  • Medical/Dental/Vision/Life Insurance
  • Paid holidays plus Paid Time Off
  • 401(k) plan and contributions
  • Long-term/Short-term Disability
  • Paid Parental Leave
  • Employee Stock Purchase Plan

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000 employees
