About The Position

We are seeking a foundational Site Reliability Engineer to join our Device Insurance Technology team as we build a new internal engineering capability. This role is a unique opportunity to help establish our DevOps and SRE practices from the ground up within a modern, cloud-native environment. As the first dedicated SRE on the team, you will play a critical role in designing, building, and owning CI/CD pipelines, deployment processes, and production observability systems. You will work closely with development teams, architects, and external partners to transition operational ownership from legacy systems and enable scalable, reliable service delivery. This is a hands-on, high-impact engineering role where you will balance building foundational systems with supporting live production environments. You will help shape the future operating model, drive automation, and influence how reliability is implemented across the platform.

Requirements

  • 4+ years of experience in DevOps and SRE roles.
  • Experience in developing and maintaining CI/CD pipelines for software deployment.
  • Experience with GitLab pipelines and Helm.
  • 4+ years of experience implementing and managing cloud-native platforms and solutions.
  • Hands-on experience with containerization (Docker, Kubernetes).
  • 4+ years of hands-on experience with incident management and with monitoring/logging tools such as Splunk, Grafana, and OpenTelemetry.
  • 4+ years of experience guiding and mentoring teams in reliability engineering practices.
  • Understanding of web protocols, how full-stack applications operate, and how data flows through them.
  • Basic knowledge of at least one major cloud platform (AWS preferred).
  • Strong communication skills and ability to work under pressure.
  • Bachelor's Degree plus 3 years of related work experience OR advanced degree with 1 year of related work experience OR combination of education and experience deemed equivalent (Required)
  • Acceptable areas of study include Computer Science, Engineering or related field (Required)
  • At least 18 years of age
  • Legally authorized to work in the United States

Nice To Haves

  • Experience integrating DevSecOps tools like code scanning, policy enforcement or container image validation.
  • Understanding of blue/green, canary or rolling deployment strategies.
  • Exposure to artifact management, secrets management or GitOps workflows.
  • Exposure to incident management frameworks including alerting, escalation and postmortem practices.
  • Understanding of Agile methodologies to improve and streamline processes.
  • Ability to analyze system performance data to identify trends and improvement opportunities.
  • Capability to drive innovation in system management and operations through new technologies and approaches.
  • Ability to adapt to new technologies and changes in the digital landscape to maintain system robustness.
  • Experience using generative AI tools (e.g., Claude, GitHub Copilot) for development support and task acceleration.
  • AWS Certified DevOps Engineer: validates technical expertise in provisioning, operating, and managing distributed application systems on the AWS platform. (Preferred)
  • Certified Kubernetes Administrator: validates the skills required for day-to-day administration of Kubernetes environments. (Preferred)
  • Google Cloud Certified - Professional DevOps Engineer: validates the ability to efficiently develop and deploy applications using Google Cloud technologies and to manage operations. (Preferred)

Responsibilities

  • Develop, configure, and support CI/CD pipelines.
  • Automate build, test, and deployment workflows to enable safe and repeatable releases.
  • Integrate automated quality checks, code scanning, and deployment validations into pipelines.
  • Support containerized deployments using Docker and Kubernetes.
  • Use Infrastructure-as-Code (IaC) and packaging tools like Helm to manage Kubernetes deployments and cloud infrastructure.
  • Participate in automated provisioning of environments and system configurations.
  • Embed monitoring and alerting into delivery pipelines.
  • Support debugging of build, deployment, and environment issues across Dev/Test/Prod systems.
  • Automate processes to enhance system reliability and resilience.
  • Minimize operational incidents through proactive monitoring and maintenance.
  • Develop scripts, tools and automation to reduce manual efforts in operational tasks.
  • Manage incident response to ensure rapid recovery and minimal disruption.
  • Help build and maintain dashboards, alerts, and logs that provide visibility into system health and application behavior.
  • Use tools such as Prometheus, Grafana, Splunk, or OpenTelemetry to monitor services and infrastructure.
  • Analyze system performance data to guide optimizations and proactively detect issues.
  • Adapt to new technologies to maintain and enhance system robustness.
  • Contribute to documentation, runbooks, playbooks, and operational readiness reviews.

Benefits

  • Employees enjoy multiple wealth-building opportunities through our annual stock grant, employee stock purchase plan, 401(k), and access to free, year-round money coaches.
  • Employees in regular, non-temporary roles are eligible for an annual bonus or periodic sales incentive or bonus, based on their role.
  • Most Corporate employees are eligible for a year-end bonus based on company and/or individual performance, set at a percentage of the employee’s eligible earnings in the prior year.
  • Certain positions in Customer Care are eligible for monthly bonuses based on individual and/or team performance.
  • Full- and part-time employees have access to the same benefits when eligible. We cover all of the bases, offering medical, dental and vision insurance, a flexible spending account, 401(k), employee stock grants, employee stock purchase plan, paid time off and up to 12 paid holidays (which total about 4 weeks for new full-time employees and about 2.5 weeks for new part-time employees annually), paid parental and family leave, family building benefits, back-up care, enhanced family support, childcare subsidy, tuition assistance, college coaching, short- and long-term disability, voluntary AD&D coverage, voluntary accident coverage, voluntary life insurance, voluntary disability insurance, and voluntary long-term care insurance.
  • Eligible employees can also receive mobile service and home internet discounts, pet insurance, and access to commuter and transit programs.
© 2024 Teal Labs, Inc