Sr TechOps and SRE Lead (AWS Cloud) - Remote

Simple Solutions, Saint Augustine, FL
Posted 12d ago, Remote

About The Position

We are seeking a highly experienced Sr TechOps and SRE Lead with deep AWS cloud expertise to lead our cloud infrastructure, DevOps practices, Site Reliability Engineering (SRE) best practices, and operational excellence initiatives. This role is both strategic and hands-on: you will design scalable architectures, improve automation, ensure system reliability, and lead the TechOps team.

Requirements

  • Bachelor’s degree in Computer Science or Engineering, or equivalent experience.
  • 10+ years in DevOps, Cloud Engineering, or Infrastructure roles.
  • 5+ years leading technical teams.
  • Strong hands-on experience with AWS services (EC2, EKS, RDS, S3, IAM, VPC, Lambda).
  • Deep knowledge of networking, Linux systems, and distributed systems.
  • Experience with Infrastructure-as-Code (Terraform or CloudFormation).
  • Strong scripting skills (Python, Bash, or similar).
  • Experience with containerization (Docker) and Kubernetes (EKS preferred).

Responsibilities

  • Architect and manage secure, scalable, and highly available infrastructure on AWS.
  • Design multi-account AWS environments using AWS Organizations.
  • Implement VPC architecture, IAM policies, networking, and security best practices.
  • Oversee EC2, ECS/EKS, Lambda, RDS, S3, CloudFront, and related AWS services.
  • Optimize AWS costs and resource utilization.
  • Implement Site Reliability Engineering (SRE) best practices.
  • Define SLIs, SLOs, and error budgets.
  • Manage monitoring and alerting (CloudWatch, Datadog, Prometheus, Grafana).
  • Lead incident response, root cause analysis (RCA), and postmortems.
  • Ensure 24/7 uptime and operational resilience.
  • Implement IAM best practices and least-privilege access controls.
  • Manage secrets and encryption keys (AWS KMS, Secrets Manager).
  • Conduct vulnerability management and patching.
  • Support compliance initiatives (SOC 2, ISO 27001, and GDPR, as applicable).
  • Lead disaster recovery planning and backup strategies.
  • Lead and mentor a team of DevOps/TechOps engineers.
  • Establish operational KPIs and performance benchmarks.
  • Manage on-call rotations and escalation processes.
  • Collaborate with Engineering, Product, Security, and Data teams.
  • Contribute to long-term infrastructure strategy and cloud roadmap.