Cloud Operations Analyst

TEKsystems
Farmington Hills, MI
Hybrid

About The Position

The Cloud Operations Analyst is responsible for leading the management, optimization, and automation of cloud and on-premises infrastructure to ensure seamless operations and business continuity. This role includes driving improvements in observability, server and batch operations, and data center management while proactively identifying and resolving performance and reliability issues. The Cloud Operations Analyst provides technical leadership, mentors team members, and consults with cross-functional teams to enhance operational excellence through best practices, process enhancements, and cutting-edge technologies.

Requirements

  • Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent experience.
  • 5+ years of experience working with monitoring and observability tools (e.g., Datadog, PagerDuty).
  • Certified PagerDuty Administrator or equivalent experience required.
  • 5+ years of experience in cloud operations or server management roles.
  • 5+ years of progressive server administration experience (Windows, Linux).
  • 5+ years of experience in designing, implementing, and managing IT workload automation solutions to optimize scheduling, orchestration, and execution of enterprise workflows across on-prem and cloud environments.
  • Experience leveraging artificial intelligence to drive innovation and solve complex problems. Demonstrated ability to utilize AI-driven solutions that optimize processes, enhance decision-making, or create transformative business outcomes.
  • Demonstrated experience working with Infrastructure as Code (Terraform, CloudFormation, and Ansible).
  • 5+ years working with cloud platforms (AWS, Azure, OCI).
  • Certified AWS SysOps Administrator or equivalent experience required.
  • Strong experience with data center infrastructure and best practices.
  • Proficiency in scripting and automation tools (Python, Bash, PowerShell).
  • Strong understanding of networking, security, and identity management in cloud environments.
  • Working knowledge of security best practices and compliance standards.
  • Working knowledge of agile methodologies.
  • Excellent troubleshooting, problem-solving, and communication skills.

Nice To Haves

  • Experience maintaining a legacy workload automation platform (Tidal)
  • Experience with AWS RDS (planned for future use)

Responsibilities

  • Independently develop, implement, and maintain observability tools to monitor cloud and on-premises systems.
  • Actively support infrastructure teams in the management and maintenance of server systems running on Windows and Linux.
  • Create dashboards, alerts, and reports to track system health, performance, and availability.
  • Analyze metrics and logs to identify trends, prevent potential issues, and optimize system performance.
  • Act as the lead consultant with FinOps teams to monitor resource utilization and ensure cost-effective operations across cloud environments.
  • Manage the lifecycle of cloud and on-premises servers, including provisioning, patching, configuration, and decommissioning.
  • Troubleshoot and resolve server-related issues, ensuring minimal downtime.
  • Implement and enforce server security policies and compliance requirements.
  • Schedule, monitor, and manage batch processes to ensure timely execution of critical tasks.
  • Identify and resolve batch failures or delays, coordinating with relevant teams to ensure smooth operations.
  • Build new batch jobs for improved performance and resource utilization.
  • Lead on-site and remote data center operations, ensuring proper functioning of hardware, power, cooling, and network infrastructure.
  • Coordinate with vendors and service providers for hardware maintenance, replacements, and upgrades.
  • Participate in on-call rotations to address system incidents and outages promptly.
  • Conduct root cause analysis and implement solutions to prevent recurrence of issues.
  • Document and communicate incident resolution processes to relevant stakeholders.
  • Work closely with cross-functional teams, including DevOps, Networking, and Application Development, to implement and maintain system integrations.
  • Maintain comprehensive documentation for configurations, processes, and incident resolutions.
  • Provide training and support to team members and other departments.

Benefits

  • Medical, dental & vision
  • Critical Illness, Accident, and Hospital
  • 401(k) Retirement Plan – Pre-tax and Roth post-tax contributions available
  • Life Insurance (Voluntary Life & AD&D for the employee and dependents)
  • Short- and long-term disability
  • Health Spending Account (HSA)
  • Transportation benefits
  • Employee Assistance Program
  • Time Off/Leave (PTO, Vacation or Sick Leave)