DevOps Engineering Manager, SRE and Cloud Services

IntegriChain • Philadelphia, PA

About The Position

The DevOps Engineering Manager, SRE and Cloud Services is responsible for leading the teams that ensure the reliability, scalability, and performance of IntegriChain’s cloud platforms and production systems. This role manages DevOps and Site Reliability Engineering functions, with a strong focus on cloud infrastructure, automation, and operational excellence. You will work closely with application engineering, platform, security, and IT teams to support product delivery while maintaining high standards for availability, resilience, and security. This role balances people leadership with hands-on technical engagement and is critical to the success of our SaaS platforms in a healthcare and life sciences environment.

Your day typically starts with connecting to the team through daily standups or operational check-ins. You review system health, active work, incidents, and priorities, making sure the team is focused on what matters most and that risks are addressed early. You stay close to production systems through dashboards, alerts, and direct conversations with engineers.

Throughout the day, you work directly with DevOps, SRE, and application engineering teams to remove roadblocks and keep work moving forward. This may involve helping troubleshoot issues, guiding technical decisions, or coordinating across teams to resolve dependencies. You are regularly involved in design and architecture discussions, helping teams think through reliability, scalability, performance, and operational readiness.

Because the team operates across multiple time zones, you spend time coordinating work and maintaining clear communication across regions. You help establish shared processes, clear handoffs, and consistent expectations so work continues smoothly around the clock.

When incidents or operational challenges arise, you support response efforts, help coordinate resolution, and ensure follow-up actions are completed. Over time, you help turn recurring issues into lasting improvements by strengthening automation, cloud practices, and reliability standards.

Requirements

7 or more years of experience in DevOps, SRE, or cloud engineering roles.
3 or more years of experience leading or managing technical teams.
Strong hands-on experience with cloud platforms such as AWS, Azure, or GCP.
Experience with CI/CD pipelines, infrastructure as code, monitoring, and incident management.
Solid understanding of reliability, scalability, and operational best practices.
Strong communication and collaboration skills.
Experience supporting SaaS platforms in regulated or compliance-driven environments.

Nice To Haves

Familiarity with SRE concepts such as SLIs, SLOs, and error budgets.
Experience working with globally distributed teams.
Background in healthcare or life sciences technology.

Responsibilities

Lead and develop a team of DevOps and SRE engineers supporting cloud infrastructure and production systems.
Set clear priorities, goals, and expectations for reliability, performance, and operational readiness.
Foster a culture of ownership, continuous improvement, and learning across the team.
Oversee cloud infrastructure across environments, ensuring scalability, resilience, and cost efficiency.
Drive adoption of infrastructure as code, automation, and standardized tooling.
Partner with engineering teams to support platform needs and production deployments.
Establish and improve SRE practices, including monitoring, alerting, incident response, and post-incident reviews.
Lead efforts to reduce operational toil and improve system reliability through automation and process improvements.
Support release management, change control, and operational governance.
Participate in design and architecture discussions to ensure systems are built for reliability, scalability, and operability.
Review and guide implementation approaches related to CI/CD, cloud services, and platform architecture.
Advocate for operational best practices early in the development lifecycle.
Work closely with Product, Engineering, Security, and IT teams to align operational priorities with business needs.
Communicate clearly with stakeholders on system health, risks, and improvement initiatives.
Support compliance and security requirements relevant to healthcare and life sciences technology platforms.

Benefits

Excellent and affordable medical benefits, plus non-medical perks including Flexible Paid Time Off and much more!
Robust Learning & Development opportunities, including more than 700 development courses free to all employees.
