ECM - SRE Principal Engineer

Motorola Solutions
Chicago, IL

About The Position

At Motorola Solutions, we believe that everything starts with our people. We're a global, close-knit community, united by the relentless pursuit to help keep people safer everywhere. We build and connect technologies to help protect people, property and places. Our solutions foster the collaboration that's critical for safer communities, safer schools, safer hospitals, safer businesses, and ultimately, safer nations. Connect with a career that matters, and help us build a safer future.

At Motorola Solutions, we're guided by a shared purpose - helping people be their best in the moments that matter - and we live up to our purpose every day by solving for safer. Because people can only be their best when they not only feel safe, but are safe. We're solving for safer by building the best possible technologies across every part of our safety and security ecosystem. That means mission-critical communications devices and networks, AI-powered video security and access control, and the ability to unite voice, video and data in a single command center view. We're solving for safer by connecting public safety agencies and enterprises, enabling the collaboration that's critical to connect those in need with those who can help. The work we do here matters.

We are seeking a skilled and motivated Technical Lead with a passion for technology and leadership to guide our Site Reliability Engineering (SRE) team in managing NG911 call routing and handling systems. These life-critical systems are hosted in public, private, and multi-cloud environments (AWS and Azure) and must achieve and maintain 99.999% availability.

Requirements

  • Proven track record as a technical leader for SRE, DevOps, or cloud infrastructure teams in complex environments.
  • Experience with mission-critical systems, ideally in emergency call management (NG911) or public safety solutions.
  • Hands-on experience in designing, analyzing, and troubleshooting large-scale distributed systems.
  • Expertise in public and multi-cloud platforms (AWS and Azure).
  • Familiarity with geographically dispersed, cross-functional team collaboration.
  • Strong knowledge of site reliability engineering principles, including monitoring, alerting, and incident management.
  • Proficiency in automation tools and frameworks (e.g., Terraform, Ansible, Jenkins, GitHub Actions).
  • Experience with distributed systems, predictive monitoring, self-healing mechanisms, and high-availability architectures.
  • Practical knowledge of technologies such as Java (preferred), .NET Core/C#, Angular, PostgreSQL, MS SQL Server, RabbitMQ/Kafka, and Redis (preferred).
  • Excellent communication skills for technical and non-technical audiences.
  • Strong problem-solving mindset and a focus on continuous improvement.
  • 8+ years of experience, including 5+ years as a technical leader for SRE, DevOps, or cloud infrastructure teams in complex environments, and practical experience with automation tools and frameworks.
  • Legal authorization to work in the U.S. indefinitely is required; employer work permit sponsorship is not available.

Nice To Haves

  • Familiarity with public safety communication standards, such as NENA i3 standards for Next-Generation 911.
  • Knowledge of hybrid cloud architecture and advanced deployment techniques (e.g., canary releases, blue/green deployments, feature flags).
  • Bachelor's degree.

Responsibilities

  • Team Leadership & Development: Provide comprehensive technical leadership for the entire SRE team. Drive the technical direction, architectural standards, and implementation of reliability best practices. Mentor and guide the team in advanced technical problem-solving and continuous technical improvement.
  • System Reliability & Availability: Oversee the design and implementation of high-availability (HA) architectures. Ensure systems meet the target of 99.999% availability. Develop and enforce strategies for observability, monitoring, and automated health issue detection.
  • Incident Management & Operations: Lead incident response efforts, including triage, troubleshooting, and communication with stakeholders. Maintain robust incident playbooks and ensure readiness for on-call support. Facilitate Failure Mode and Effects Analysis (FMEA) and Chaos Engineering activities.
  • Collaboration & Communication: Provide the technical overview and direction for all SRE team projects, ensuring consistent architectural excellence and reliability standards across the board. Act as the key technical liaison, clearly articulating the SRE team's technical strategy and reliability status to engineering teams, product management, and executive leadership. Collaborate with development teams to drive technical alignment, promote best practices, and communicate system performance and technical achievements.
  • Continuous Improvement & Automation: Drive automation initiatives to enhance system resilience and reduce manual intervention. Track and report key metrics such as SLOs, error budgets, MTTD, and MTTR. Stay informed about emerging technologies and best practices to continuously improve reliability processes.
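
To give a sense of scale for the 99.999% availability target and the metrics named above (error budgets, MTTR), here is a minimal Python sketch of the arithmetic. It is an illustration only, not a description of how the team measures these numbers: the function names and the sample incident durations are hypothetical, and a 365-day year is assumed.

```python
# Illustrative arithmetic only: function names and the sample incident
# durations below are hypothetical, not taken from the posting.

MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def downtime_budget_minutes(availability_pct, period_minutes=MINUTES_PER_YEAR):
    """Downtime allowed in the period for a given availability target."""
    return period_minutes * (1 - availability_pct / 100)


def mttr_minutes(incident_durations):
    """Mean time to recovery: average duration of the recorded incidents."""
    if not incident_durations:
        return 0.0
    return sum(incident_durations) / len(incident_durations)


def budget_remaining_minutes(availability_pct, incident_durations,
                             period_minutes=MINUTES_PER_YEAR):
    """Error budget left after subtracting the downtime already incurred."""
    budget = downtime_budget_minutes(availability_pct, period_minutes)
    return budget - sum(incident_durations)


if __name__ == "__main__":
    incidents = [1.5, 2.0]  # made-up incident durations, in minutes
    print(f"Yearly budget at 99.999%: {downtime_budget_minutes(99.999):.3f} min")
    print(f"MTTR: {mttr_minutes(incidents):.2f} min")
    print(f"Budget remaining: {budget_remaining_minutes(99.999, incidents):.3f} min")
```

Under these assumptions, a 99.999% target leaves roughly five and a quarter minutes of downtime per year, which is why the role emphasizes automated detection and self-healing over manual intervention.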

Benefits

  • Incentive Bonus Plans
  • Medical, Dental, Vision benefits
  • 401K with Company Match
  • 10 Paid Holidays
  • Generous Paid Time Off Packages
  • Employee Stock Purchase Plan
  • Paid Parental & Family Leave