Site Reliability Engineer Experienced, Senior or Lead

Boeing · Englewood, CO
$108,800 - $209,300 · Hybrid

About The Position

You are a Site Reliability Engineer (Experienced, Senior, or Lead) with experience building and maintaining large-scale solutions. You will work closely with the Support, Customer Success, and Software Development organizations to build and maintain modern infrastructure and operations solutions for critical business services. Collaboration, communication, and a passion for change are your strengths. You will join a team responsible for implementing automation, monitoring, scaling, and Infrastructure as Code solutions in both local datacenters and cloud platforms.

Requirements

  • 3+ years of experience in the information technology industry with a focus on Infrastructure or Operations support
  • 2+ years of experience in SRE and/or DevOps
  • Proficient in technologies such as Kubernetes, Istio, Rancher, Git, Helm, Ansible, Chef, Docker, Prometheus, Grafana, and AppDynamics
  • Strong understanding of at least one cloud platform such as AWS, GCP, or Azure
  • Experience troubleshooting complex cloud infrastructure problems and Linux OS issues
  • Basic understanding of network security fundamentals
  • Good verbal and written communication skills, with ability to work with both technical and non-technical stakeholders
  • Bachelor's degree in Computer Science or a related field preferred, or equivalent work experience
  • Proficient in Linux and Windows commands, in developing scripts using shell programming with Bash or PowerShell, and in additional scripting languages like Python and JavaScript
  • Experience using SDLC and Agile methodologies, with focus on Scrum and Kanban
  • Confident in a fast-paced environment with competing priorities, and able to multi-task and manage expectations

Nice To Haves

  • 4+ years of experience in the information technology industry with a focus on Infrastructure or Operations support
  • 5+ years of experience in the information technology industry with a focus on Infrastructure or Operations support
  • Applicable and appropriate educational/certification credentials from an accredited institution and/or equivalent experience

Responsibilities

  • Contribute to reliability, quality, security, supportability, and scalability of applications
  • Support effective resolution of large-scale incidents to reduce MTTR and customer impact
  • Participate in post-mortem and root cause analyses
  • Manage complete lifecycle of production environments, including necessary configurations, integrations, and application admin activities (operational, migrations, and upgrades)
  • Assist in troubleshooting performance, integration, and user management issues by digging into application and system logs
  • Develop procedures and scripts for monitoring and automation of manual processes
  • Engage with Software Development and Customer Success organizations to troubleshoot issues and participate in any planned activities
  • Follow ITSM processes for Incident, Request, and Change Management
  • Maintain system documentation for configuration and troubleshooting of known issues
  • Participate in on-call rotation when required to provide support for urgent, off-hour issues
  • Implement ‘self-healing’ capabilities to limit after-hours and on-call needs
  • Facilitate research, evaluation, and design of new software solutions

Benefits

  • Competitive base pay and variable compensation opportunities
  • Health insurance
  • Flexible spending accounts
  • Health savings accounts
  • Retirement savings plans
  • Life and disability insurance programs
  • Programs that provide for both paid and unpaid time away from work