Senior Site Reliability Engineer II

Remitly
FL
72d
$102,800 - $171,300

About The Position

Are you a collaborative Sr. Azure SRE looking to work for a mission-driven global organization? Do you have advanced Azure SRE skills and want to put them to use to help drive innovation?

LexisNexis® Risk Solutions provides customers with innovative technologies, information-based analytics, decisioning tools and data management services that help them solve problems, make better decisions, stay compliant, reduce risk and improve operations. Headquartered in metro Atlanta, Georgia, it operates within the Risk market segment of RELX, a global provider of information-based analytics and decision tools for professional and business customers.

This Azure SRE will administer Azure Kubernetes Service (AKS) clusters running critical, always-on middleware that handles thousands of transactions per second (TPS). They will be expected to conduct operations in a manner consistent with a five-nines (99.999%) availability target. The team is entrusted with applying software engineering practices to IT operations tasks to maintain a scalable and reliable production environment, and it automates recovery to protect critical service levels.

Requirements

  • Current and extensive experience as an Azure SRE.
  • Possess a deep understanding of IaC configuration design, software-defined networking and infrastructure, and observability platforms such as ELK, Grafana Loki, and/or OpenTelemetry.
  • Possess an understanding of how to observe distributed systems and their dependencies, and how to automate recovery to protect service levels.
  • Proficiency in scripting and automation (e.g., Python, Bash, Ansible).
  • Knowledge of running software developed in languages such as Java, C++, .NET, and/or Node.js is expected.
  • Experience with Kubernetes, Terraform, Helm, GitHub (Actions), ArgoCD, and IaC automation as well as solid Linux and cloud networking skills.

Responsibilities

  • Designing, writing, and maintaining Infrastructure as Code (IaC) using Terraform and Helm to provision and manage cloud environments (AWS, Azure, or GCP).
  • Creating and maintaining Helm charts for Kubernetes deployments, ensuring consistency across environments.
  • Developing modular, reusable Terraform templates for VPCs, subnets, clusters, and observability tooling.
  • Conducting code reviews for Terraform and Helm changes to ensure compliance, reusability, and security.
  • Taking ownership of challenging reliability and toil-reduction projects.
  • Contributing to process improvements through experience and knowledge.
  • Deploying AKS clusters and performing cutovers, updating base images, testing IaC changes, and handling other work focused on daily operations.

Benefits

  • Health Benefits: Comprehensive, multi-carrier program for medical, dental and vision benefits.
  • Retirement Benefits: 401(k) with match and an Employee Share Purchase Plan.
  • Wellbeing: Wellness platform with incentives, Headspace app subscription, Employee Assistance and Time-off Programs.
  • Short- and Long-Term Disability, Life and Accidental Death Insurance, Critical Illness, and Hospital Indemnity.
  • Family Benefits, including bonding and family care leaves, adoption and surrogacy benefits.
  • Health Savings, Health Care, Dependent Care and Commuter Spending Accounts.
  • In addition to annual Paid Time Off, we offer up to two days of paid leave each to participate in Employee Resource Groups and to volunteer with your charity of choice.