Principal Site Reliability Engineer

Vertafore
Denver, CO
Remote

About The Position

Vertafore is a leading technology company whose innovative software solutions are advancing the insurance industry. Our suite of products helps our customers better manage their business, boost productivity and efficiency, and lower costs while strengthening relationships. Our mission is to move InsurTech forward by putting people at the heart of the industry. We are leading the way with product innovation, technology partnerships, and a focus on customer success. Our fast-paced and collaborative environment inspires us to create, think, and challenge each other in ways that make our solutions and our teams better. We are headquartered in Denver, Colorado, with offices across the U.S., Canada, and India.

We are seeking a Principal Site Reliability Engineer to define the strategic vision and own the enterprise-wide reliability, scalability, and performance of our critical production services. As a foundational pillar of our engineering organization, this role drives architectural standards for the full service lifecycle, from initial design and deployment readiness to proactive production operations.

At Vertafore, we view reliability as a core engineering responsibility. You will operate autonomously across AWS, hybrid data centers, and customer-hosted environments, setting the technical direction for how we treat operations as a software engineering challenge. This role is pivotal in transitioning cross-departmental teams toward a highly proactive, engineering-first culture.

Requirements

  • 12 to 15+ years of hands-on Cloud Operations, SRE, or reliability-focused engineering experience, with a proven track record of end-to-end enterprise service ownership.
  • Demonstrated ability to operate at a Principal/Architect scope, driving large-scale reliability outcomes and operational excellence across global organizations.
  • Expert-level software engineering skills in C#, .NET, Java, Python, or React.
  • Deep expertise in scaling core SRE principles (SLIs, SLOs, error budgets) across complex, distributed systems.
  • Mastery of AWS, Kubernetes, CI/CD pipelines, Infrastructure-as-Code, and extensive knowledge of Linux and Windows environments and relational databases.
  • Bachelor’s or Master’s degree in Computer Science or a related technical field.
  • Participation in an executive on-call rotation, with flexible hours as required.
  • A fast learner.
  • A problem solver.
  • Ability to document procedures.
  • Able to meet deadlines.
  • Good communication skills.
  • Able to communicate effectively with both technical and non-technical audiences.
  • Able to comply with processes and procedures.
  • Able to maintain professional composure in any situation.
  • Flexible in working extended hours on occasion or as required.
  • Driven to improve, personally and professionally.
  • Operates best in a fast-paced, flexible work environment and works well in a team.
  • High-speed internet to accommodate work-from-home needs.
  • Occasional lifting and/or moving up to 10 pounds.
  • Frequent repetitive hand and arm movements required to operate a computer.
  • Specific vision abilities required by this job include close vision (working on a computer, etc.).
  • Frequent sitting and/or standing.
  • The selected candidate must be legally authorized to work in the United States.

Nice To Haves

  • Exposure to the insurance industry is desired but not mandatory.

Responsibilities

  • Define the standards for end-to-end service ownership, holding the organization accountable for availability, performance, and overall operational health.
  • Lead cross-departmental initiatives to influence system design at the architectural level, driving fault tolerance, strict compliance, and operational sustainability across public and private clouds.
  • Dictate the enterprise strategy for observability frameworks, ensuring the Four Golden Signals (Latency, Traffic, Errors, and Saturation) provide actionable, predictive insights across all platforms.
  • Establish the governance models for defining and managing SLIs and SLOs across multiple product lines.
  • Champion Error Budgets as the ultimate technical arbiter at the executive level, balancing feature velocity with the absolute requirement for platform stability.
  • Lead incident response for the most critical, high-severity events.
  • Foster a "Win Together" environment by championing a Blameless Postmortem culture globally, ensuring root cause analyses focus strictly on systemic and process improvements rather than individual error.

Benefits

  • Bonus

What This Job Offers

Job Type

Full-time

Career Level

Principal

Number of Employees

501-1,000 employees

© 2026 Teal Labs, Inc