Technical Operations Engineer

Medlytix, Roswell, GA

About The Position

The Technical Operations Engineer supports the performance, reliability, and visibility of the Medlytix Production System. This hybrid technical role combines production monitoring, telemetry analysis, workflow orchestration support, and automation engineering. Working under the direction of the Director, this individual plays a critical role in maintaining operational health across distributed systems, data pipelines, and workflow orchestration environments.

The position requires strong hands-on expertise with monitoring tools, telemetry platforms, cloud technologies, and data processing systems. The Technical Operations Engineer evaluates system behavior, investigates production issues, maintains monitoring of production systems, and drives automation and reliability improvements. The ideal candidate brings strong critical thinking and problem-solving skills to evaluate complex system behaviors, identify root causes, and implement effective solutions across interconnected systems.

Requirements

  • Bachelor's degree in Computer Science, Information Systems, Engineering, Data Science, or a related field, with 3+ years of experience in technical operations, data engineering, or business intelligence
  • Strong proficiency in SQL and experience with Python or scripting for troubleshooting, analysis, and automation
  • Hands-on experience with workflow orchestration tools (e.g., Airflow) and data pipelines
  • Familiarity with cloud platforms (AWS preferred) and monitoring/observability tools (e.g., Datadog, CloudWatch)
  • Proven ability to perform root cause analysis and troubleshoot complex issues across distributed systems
  • Strong critical thinking and problem-solving skills with the ability to quickly learn and apply new tools and technologies
  • Effective communication skills with the ability to translate technical findings into actionable insights

Nice To Haves

  • Exposure to ML/AI concepts, tools, or operational use cases

Responsibilities

  • Monitor systems, workflows, and data pipelines to ensure optimal performance, high data quality, and system reliability
  • Build and maintain monitoring dashboards, alerts, and observability frameworks using telemetry tools
  • Analyze workflow performance metrics, such as latency and failures, and identify trends or anomalies
  • Support workflow orchestration platforms (e.g., Airflow) to ensure successful job execution and dependency management
  • Troubleshoot workflow failures, data pipeline issues, and system disruptions across distributed environments
  • Perform root cause analysis using logs, telemetry data, and execution history, and provide actionable recommendations
  • Manage and respond to production incidents, including triage, escalation, and coordination with cross-functional teams
  • Ensure data quality and integrity by implementing validation checks and identifying anomalies early
  • Develop automation scripts and tools to reduce manual operational effort and improve efficiency
  • Identify opportunities to improve system reliability, fault tolerance, and operational scalability
  • Collaborate with Engineering, Product, and Data teams to resolve issues and enhance system performance
  • Communicate technical findings clearly and contribute to operational reporting and dashboards