About The Position

As a DevOps engineer, you will provide L3 support across Edge/on‑prem and cloud environments, troubleshooting and resolving escalated incidents and collaborating with senior engineers and Product to validate fixes and prevent recurrences. A good understanding of our product, its components, and their interactions is essential for troubleshooting and problem remediation. Good Linux administration skills (primarily RHEL, plus Ubuntu) and OpenShift/Kubernetes experience are also required. You will build lightweight automations (Python, Bash, Ansible) to reduce Operations (Customer Deployment) issues. When not handling L3 work, you’ll execute safe cloud deployments and upgrades via existing GitOps/IaC pipelines (Flux, Ansible, Terraform), offering feedback and making minor adjustments. You’ll maintain and tune Alertmanager rules and improve Grafana dashboards to ensure actionable, low‑noise alerting with Prometheus/Grafana. You will also automate and improve SOPs, support knowledge sharing through workshops, training, and clear documentation, and contribute to infrastructure testing and security/vulnerability remediation with the DevOps engineering and Security teams.

Requirements

  • Experience: 3+ years in DevOps/SRE or similar operations‑focused roles with strong automation experience.
  • Networking: Experience with DNS, routing, container communication, firewalls, reverse proxying, load balancing, and edge-to-cloud communication, including troubleshooting.
  • System Administration: Good system administration skills for troubleshooting OS-level outages and for deploying and troubleshooting Everseen’s containerized Edge application in customer networks.
  • Cloud Expertise: Proven experience with Azure (or GCP), including fully automated infrastructure and deployment.
  • CI/CD Pipelines: Proven experience in implementing and managing CI/CD pipelines (GitLab CI/CD preferred) and good knowledge of Git and associated workflows (e.g., Gitflow).
  • Observability: Proven experience with monitoring, logging, and alerting tools and stacks.
  • Scripting: Good scripting skills in Bash and Python.
  • Containerization: Good knowledge of Kubernetes and OpenShift, including cluster management, orchestration and auto-scaling, and deployments using Helm charts and GitOps.
  • Microservices Experience: Proven experience with microservices architecture and related deployment strategies.
  • Infrastructure as Code: Experience working with Terraform modules.
  • Configuration management: Experience with Ansible, maintaining and writing playbooks and roles.
  • Security Practices: Understanding of DevSecOps principles and experience implementing security best practices within CI/CD pipelines.
  • Analytical and Problem-Solving Skills: Strong analytical and problem-solving abilities, leveraging data to inform product decisions. This skill is essential for identifying market opportunities, optimizing product features, and addressing challenges effectively.
  • Communication Skills: Excellent presentation, oral, and written communication skills. Fluent business English is a requirement.
  • Customer Focus: A passionate advocate for determining and delivering solutions with a high level of customer satisfaction, treating customer experience as a top priority in solution delivery.
  • Interest in Learning and Growth Mindset: Demonstrated interest in learning and a strong desire to expand knowledge in their field. Eagerness to explore new technologies, methodologies, and best practices to enhance skills and capabilities. Results-oriented attitude, with a drive to achieve objectives efficiently.
  • Technical Leadership: Capable of engaging in technical discussions with stakeholders and leading DevOps projects. Mentors and coaches team members.

Nice To Haves

  • Experience using a service mesh solution such as Istio
  • Experience using tracing solutions such as Grafana Tempo or Jaeger
  • Experience using OpenTelemetry (OTel)
  • Experience with RenovateBot or similar tools
  • Experience with Node.js

Responsibilities

  • Designs and maintains CI/CD pipelines using GitLab CI/CD.
  • Implements Infrastructure as Code (IaC) with tools like Terraform.
  • Manages basic deployments and assists in CI/CD process improvements.
  • Writes and executes simple automation scripts (e.g., Ansible playbooks).
  • Troubleshoots and optimizes Kubernetes cluster operations.
  • Writes and maintains system operations documentation (articles, diagrams, data flows, etc.) for new and existing applications and services.
  • Keeps up to date on best practices and new technologies.
  • Designs and executes staging/UAT/production and mass service deployment scenarios.
  • Collaborates on technical architecture and system design.
  • Analyzes and collects data: log files, application stack traces, thread dumps, etc.
  • Reproduces and simulates application incidents to create debug reports and coordinate delivery of application fixes.
  • Occasionally works outside regular hours.
  • Works with customers and travels to international customer or partner locations.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 1,001-5,000
