Senior DevOps Engineer, Application Reliability

Wellfit TechnologiesIrving, TX
23h$130,000 - $150,000Hybrid

About The Position

• We are seeking a Senior DevOps Engineer, Application Reliability to own and evolve the reliability, observability, and performance of our Azure-based application ecosystem. • This role sits between platform engineering and application development, focusing on application-level reliability rather than traditional infrastructure SRE. You’ll partner closely with engineering teams to ensure our systems are observable, resilient, and safe to deploy—especially as we accelerate delivery through AI-assisted development and modern deployment practices. • If you enjoy debugging real production issues, improving developer confidence through strong observability, and preventing incidents before they happen, this role is built for you.

Requirements

  • Bachelor’s degree in Computer Science, IT, or equivalent experience.
  • Azure Fundamentals (AZ-900) certification required.
  • Proven experience in application-focused DevOps or SRE roles, with deep hands-on Azure work.
  • Strong experience with Azure App Services, Application Insights, and Azure Monitor.
  • Ability to read, understand, and debug .NET/C# and Angular code.
  • Experience with incident response, on-call operations, and RCA writing.
  • Comfort operating in fast-moving environments with increasing deployment velocity.

Nice To Haves

  • Grafana, Prometheus, Datadog, or Dynatrace
  • Azure Front Door, CDN, Function Apps, WebJobs
  • Service Bus, Event Hub, or distributed messaging
  • Additional Azure certifications

Responsibilities

  • Design and implement end-to-end observability for Azure App Services using Application Insights, Azure Monitor, and distributed tracing.
  • Build and maintain actionable dashboards that surface application health, performance bottlenecks, and reliability risks.
  • Analyze logs, metrics, and traces to proactively identify issues and reduce MTTD / MTTR.
  • Partner with developers to instrument applications correctly and improve runtime visibility.
  • Serve as a subject-matter expert on Azure App Service architecture, scaling, and performance tuning.
  • Debug production issues at the application level, including C#, .NET, Angular, and SQL-related problems.
  • Use tools such as Diagnose and Troubleshoot, Kudu, and App Service diagnostics to resolve complex issues.
  • Automate monitoring, alerting, and common remediation workflows using PowerShell and Azure tooling.
  • Reduce operational toil through better tooling, alert quality, and self-service diagnostics.
  • Support safe daytime deployments, incremental rollouts, and environment stability.
  • Participate in on-call rotations and lead incident response for application reliability issues.
  • Write clear, high-quality root cause analyses (RCAs) focused on prevention, not blame.
  • Continuously improve reliability patterns as delivery velocity increases.
  • Act as a reliability partner to development teams—not a gatekeeper.
  • Document best practices for monitoring, debugging, and deployment safety.
  • Help lay the observability foundation required for AI-accelerated development workflows (BMAD).

Benefits

  • Medical, dental, vision, generous PTO.
  • Competitive Compensation: Salary, bonus eligibility, and 401(k) matching.
© 2024 Teal Labs, Inc
Privacy PolicyTerms of Service