DWX - Site Reliability & Automation Engineer

Group 1001
$180,000 - $230,000 | Remote

About The Position

Group 1001 is a consumer-centric, technology-driven family of insurance companies focused on delivering outstanding value and operational performance. This is a dedicated Site Reliability & Automation Engineer role for the DWX team, centered on automation, efficiency, and AI enablement so that engineers can concentrate on core tasks. The engineer will identify high-leverage automation opportunities, build production-grade automated workflows, enable AI tools safely, and drive reliability practices within DWX. This role is crucial for improving engineering efficiency, preventing incidents, and unlocking new capabilities.

Requirements

  • 7+ years in SRE, platform engineering, or DevOps roles, ideally with experience in regulated environments (L&A, insurance, or financial services preferred).
  • Proven experience building and running automation at scale.
  • Demonstrated ability to make systems measurably more reliable, eliminate manual processes through engineering, and accelerate teams through tooling.
  • Expert-level command of Infrastructure-as-Code (Terraform, Bicep, ARM, or equivalent).
  • Expert-level command of CI/CD pipelines (Azure DevOps, GitHub Actions) with real-world experience in testing, gating, and progressive delivery.
  • Expert-level command of scripting and development (PowerShell, Python, and at least one general-purpose language).
  • Expert-level command of Git (branching strategies, code review discipline, repo hygiene).
  • Expert-level command of cloud platforms (Azure deeply; AWS or GCP a plus).
  • Expert-level command of observability (Azure Monitor, Log Analytics, KQL, Application Insights, or comparable stacks).
  • Expert-level command of API and integration work (Graph API, REST, webhooks, event-driven patterns).
  • Genuine, current depth in AI and automation, including hands-on experience deploying LLM-powered workflows in enterprise environments.
  • Understanding of agentic systems, MCP, retrieval-augmented patterns, and AI safety practices.
  • A point of view on which AI use cases are real and which are hype.
  • Experience with AI governance (prompt safety, data classification, output validation, audit trails).
  • Demonstrated digital transformation mindset: experience helping organizations move from ticket-driven to product-driven operations, from runbook-driven to code-driven, and from reactive to proactive.
  • Ability to build the business case as well as the system.
  • A strong aptitude for identifying automation opportunities and a passion for improving the lives of other engineers.

Nice To Haves

  • Experience implementing SRE practice from scratch in an organization that didn't previously have it.
  • Background in platform engineering (building internal developer platforms, golden paths, paved roads).
  • AI/ML engineering experience beyond prompt-level work (fine-tuning, evaluation, RAG architectures, agent frameworks).
  • Certifications: Azure Solutions Architect Expert, Azure DevOps Engineer Expert, HashiCorp Terraform Associate, or equivalent.
  • Experience with ServiceNow scripting and flow designer at a meaningful depth.
  • Background contributing to or maintaining open source automation or SRE tooling.
  • Experience in regulated industries with appreciation for the compliance constraints around AI and automation.

Responsibilities

  • Identifying the highest-leverage automation and AI-enablement opportunities within DWX by embedding with engineers, observing work, and analyzing data.
  • Engineering end-to-end automated workflows using Infrastructure-as-Code (Terraform, Bicep), CI/CD (Azure DevOps, GitHub Actions), configuration-as-code, and orchestration platforms (Logic Apps, Power Automate, ServiceNow flows, custom services).
  • Ensuring automation code is production-grade, version-controlled, peer-reviewed, observable, and resilient.
  • Safely enabling AI tools (Copilot, Claude, internal LLM platforms, agentic systems) across DWX by defining patterns for AI-assisted troubleshooting, AI-augmented runbooks, prompt libraries, agent workflows, and necessary guardrails.
  • Driving SRE practices within DWX, including meaningful SLIs and SLOs, robust observability, effective post-incident learning, and chaos/resilience thinking.
  • Partnering with the Senior Support Manager on problem management to convert recurring incidents into automation backlog.
  • Collaborating with Solutions Engineers, Operations Engineers, Microsoft Engineers, Cybersecurity, Networks, Architecture, Product Management, and Business Technology teams to enhance their effectiveness.

Benefits

  • Comprehensive health, dental, and vision insurance plan options.
  • Basic and Supplemental Life Insurance.
  • Short and Long-Term Disability.
  • Employee Assistance Program.
  • Wellness programs.
  • 401(k) plan with matching contributions.