Principal Site Reliability Engineer

Greystar, Columbia, SC

About The Position

We are seeking a Principal Site Reliability Engineer to lead the technical strategy and greenfield build of our Azure platform. This is a hands-on DevOps role with a strong focus on Azure system maintenance and tracking. You’ll be setting up new environments, automating operations, and keeping services reliable, observable, secure, and cost-efficient at scale. This is a unique opportunity to establish a solid technical foundation and influence the direction of our technology stack.

Requirements

  • 5+ years in software engineering, network engineering, DevOps, or systems administration.
  • 3+ years operating large-scale cloud services in Azure.
  • Strong DevOps skills: Terraform/Bicep, GitOps/CI-CD, and scripting (Python/Go/Bash/PowerShell).
  • Deep AKS operations (Helm, upgrades, HPA/KEDA, backups) and Azure networking (VNets/peering, Private Link, App Gateway/WAF, Front Door/DNS).
  • Observability expertise (Azure Monitor, Log Analytics/KQL, App Insights, Prometheus/Grafana).
  • Proven incident management, change control, and documentation/runbook craft.
  • Comfort building net-new systems in a fast-moving, startup-like environment.

Nice To Haves

  • Takes initiative and drives projects from conception to completion.
  • Strong communication skills and ability to collaborate across teams.
  • Background in investment management or multifamily real estate.

Responsibilities

  • Build & Deploy: Azure landing zones, AKS, networking, and CI/CD from scratch (Terraform/Bicep, GitHub Actions/Azure DevOps).
  • Operate & Scale: Patching, upgrades, backup/DR, inventory/tagging, drift detection, and config/state tracking (e.g., Azure Policy, Resource Graph).
  • Reliability & Observability: Define SLOs/error budgets; instrument metrics/logs/traces with Azure Monitor, Log Analytics, App Insights, and Prometheus/Grafana; create actionable alerts and runbooks.
  • Influence the product architecture and roadmap so that supportability, as customers experience it, is a key consideration.
  • Collaborate closely with engineering teams on building and enhancing tooling and automation solutions for faster resolution of issues impacting SLOs.
  • Collaborate with customers to understand their pain points around supportability and SLO attainment, and formulate strategies for addressing recurring issues.

Benefits

  • Competitive Medical, Dental, Vision, and Disability & Life insurance benefits.
  • Low (free basic) employee Medical costs for employee-only coverage; costs discounted after 3 and 5 years of service.
  • Generous Paid Time off: 15 days of vacation, 4 personal days, 10 sick days, and 11 paid holidays.
  • Birthday off after 1 year of service.
  • Additional vacation accrued with tenure.
  • 6-Week Paid Sabbatical after 10 years of service (and every 5 years thereafter).
  • 401(k) with Company Match up to 6% of pay after 6 months of service.
  • Paid Parental Leave and lifetime Fertility Benefit reimbursement up to $10,000.
  • Employee Assistance Program.
  • Critical Illness, Accident, Hospital Indemnity, Pet Insurance and Legal Plans.
  • Charitable giving program and benefits.