Senior Software Engineer

eMed, Miami, FL

About The Position

As a Senior Software Engineer in SRE at eMed, you will play a key role in ensuring our platform is highly available, secure, and performant. You’ll lead reliability engineering efforts across production systems, drive operational excellence, and collaborate closely with application and infrastructure teams to design resilient services. This role suits an engineer with a software mindset and deep operational experience, who thrives on improving systems through automation and proactive engineering.

Requirements

  • Strong experience operating Kubernetes and cloud-native infrastructure (preferably EKS on AWS) in production environments
  • Proficiency in AWS services, including networking, compute, IAM, and logging/monitoring tools (e.g. CloudWatch, ELB, VPC)
  • Skilled in Terraform and Infrastructure as Code practices
  • Deep understanding of observability tooling (metrics, logs, tracing) and incident management workflows
  • Strong coding skills for building tools, scripts, and automation
  • Ability to troubleshoot complex infrastructure issues and lead delivery of reliable cloud solutions

Nice To Haves

  • Experience implementing SLAs, SLOs, and error budgets to guide operational priorities
  • Background in healthcare or other regulated industries with security and compliance requirements
  • Previous involvement in platform security reviews

Responsibilities

  • Design and implement robust monitoring, alerting, and observability systems across all services and infrastructure
  • Lead reliability reviews, incident response, and post-incident analysis, with a focus on prevention, learning, and long-term improvements
  • Improve service scalability, fault tolerance, and performance through architectural input and systems optimisation
  • Build and maintain automation for infrastructure management using Terraform, and delivery pipelines using GitHub Actions
  • Partner with software engineers to improve the operational readiness and resilience of services, including capacity planning and runbooks
  • Lead initiatives to reduce operational toil through tooling, automation, and process improvement
  • Manage and optimise our production Kubernetes and AWS environments with a focus on reliability, security, and cost-effectiveness
  • Contribute to security hardening efforts, including network controls, secrets management, and compliance readiness
  • Participate in and lead in-person stand-ups, incident reviews, and cross-team planning sessions
  • Share knowledge and mentor engineers on best practices in observability, incident response, and operational engineering

Benefits

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401k with Company Match)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off
  • Short Term & Long Term Disability
  • Training & Development
  • Catered Breakfast and Lunch 5 Days a Week
  • Wellness Resources