Site Reliability Engineer III

onX
Bozeman, MT
$130,000 - $153,000
Remote

About The Position

As a pioneer in digital outdoor navigation with a suite of apps, onX was founded in Montana, which in turn has inspired our mission to awaken the adventurer inside everyone. With more than 400 employees located around the country working in largely remote/hybrid roles, we have created regional “Basecamps” to help remote employees find connection and inspiration with other onXers. We bring our outdoor passion to work every day, coupling it with industry-leading technology to craft dynamic outdoor experiences.

Through multiple years of growth, we haven’t lost our entrepreneurial ethos at onX. We offer a fast-paced, growing, tech-forward environment where ownership, accountability, and passion for winning as a team are essential. We value diversity and believe it leads to different perspectives and inspires both new adventures and new growth. As a team, we’re hungry to improve, value innovation, and believe great ideas come from any direction.

onX is seeking a Site Reliability Engineer to build and maintain the infrastructure that enables our developers to ship reliably at scale. You'll manage onX's infrastructure platform, deployment automation, and observability through infrastructure-as-code, keeping systems reliable and performant while maintaining a simple path to production for development teams. This is a great opportunity to work on infrastructure that directly impacts millions of outdoor enthusiasts. This position will report to the Principal Site Reliability Engineer.

As an onX Site Reliability Engineer, your day-to-day responsibilities are listed in the Responsibilities section below.

Requirements

  • You have a B.S. or M.S. in computer science or a related field, or equivalent relevant experience
  • You have 5+ years of experience, including 3+ years supporting production systems
  • You have a strong interest in and experience with Kubernetes, networking, and infrastructure-as-code
  • You have experience with Terraform/OpenTofu
  • You have exposure to at least one major cloud platform
  • You evaluate technologies and solutions based on merit, stability, performance, and ease of debugging
  • You have practical experience with different types of datastores (SQL, NoSQL, object storage) and can explain when to use each based on data access patterns and scalability needs
  • You have a strong computer science foundation
  • You believe that your profession is a craft and you’re driven to improve every day
  • You take strong ownership of your work and platform responsibilities

Nice To Haves

  • Familiarity with Google Cloud Platform
  • Strong ability to troubleshoot and break down issues
  • Experience working with high throughput, low latency services
  • Experience working with a distributed team
  • Experience working with IAM, auditing & security management within a cloud environment
  • Experience working with GIS mapping systems and map tiles
  • Experience working with Claude Code
  • Experience working with Airflow or equivalent ETL systems

Responsibilities

  • Deploy, monitor, and maintain highly available systems using technologies such as Terraform, CockroachDB, and GCP services including GKE (Kubernetes), Cloud SQL, Bigtable, Cloud Composer (Airflow), Google Cloud Storage, BigQuery, Pub/Sub, and Cloud Run.
  • Maintain and extend a large, mature Terraform codebase.
  • Analyze systems and make recommendations to improve performance and availability and to minimize cost.
  • Automate manual systems to minimize toil wherever possible.
  • Develop and maintain integrations with third-party monitoring and alerting systems, such as Google Cloud Monitoring, Prometheus, OpenTelemetry, Checkly, and Rootly.
  • Drive incident response best practices for on-call engineering teams across onX. Participate in the SRE team's on-call rotation for core infrastructure.
  • Collaborate in architectural decisions and direction involving our services and initiatives.

Benefits

  • Competitive salaries, annual bonuses, equity, and opportunities for growth
  • Comprehensive health benefits including a no-monthly-cost medical plan
  • Fully paid parental leave of 5 or 13 weeks
  • 401(k) matching at 100% of the first 3% you save and 50% of the next 2% (from 3% to 5%)
  • Company-wide outdoor adventures and amazing outdoor industry perks
  • Annual “Get Out, Get Active” funds to fuel your active lifestyle in and outside of the gym
  • Flexible time away package that includes PTO, STO, VTO, quiet weeks, and floating holidays