Senior Cloud DevOps Engineer (Azure & AWS)

Michael Baker International
Pittsburgh, PA
$130,000 - $170,000

About The Position

As a Senior Cloud DevOps Engineer at Michael Baker International (MBI), you will take a hands-on role in designing, building, and maintaining our cloud infrastructure across Microsoft Azure and Amazon Web Services. Reporting directly to the CTO, you will work closely with the VP of Product to align cloud platforms with product delivery timelines and feature roadmaps, with the VP of Infrastructure to maintain operational excellence and reliability standards, and with the CISO to enforce security best practices and regulatory compliance across all environments. You will be responsible for fast, secure, and reliable software delivery through automation and DevOps best practices. You will actively implement solutions, from infrastructure provisioning and CI/CD pipeline creation to incident response and cloud vendor partnership management. Your work will directly support MBI’s strategic technology goals, including Vision 2030 initiatives, digital platform delivery, and enterprise-wide AI/ML infrastructure enablement.

Requirements

  • Extensive Cloud Expertise: 5+ years of hands-on experience designing and supporting Azure and AWS cloud infrastructure in production environments. Deep understanding of core services on both platforms and experience with hybrid or multi-cloud strategies.
  • Automation & DevOps Tools: Strong proficiency with CI/CD pipelines (Jenkins, Azure DevOps, GitLab CI, GitHub Actions), Infrastructure as Code (Terraform, CloudFormation, ARM/Bicep), and configuration management tools (Ansible, Chef, PowerShell DSC).
  • Infrastructure as Code Best Practices: Deep understanding and demonstrated experience implementing DevOps infrastructure as code best practices, including modular template design, state management, drift detection, policy-as-code, and environment promotion strategies.
  • Programming/Scripting: Solid scripting and coding abilities in Python, Bash, and PowerShell for automating tasks, integrating systems, and building DevOps workflows.
  • Cloud Resource Management: Proven experience managing cloud resources at enterprise scale, including resource tagging strategies, governance frameworks, budget controls, subscription/account structures, and FinOps practices.
  • Monitoring & Troubleshooting: Experience with monitoring/alerting frameworks (Azure Monitor, CloudWatch, Prometheus, Elastic stack, Grafana) and incident management processes. Strong ability to diagnose and resolve complex issues across distributed systems.
  • Security & Compliance Knowledge: Strong understanding of cloud security practices—IAM, network segmentation, encryption, zero-trust architecture, and compliance requirements. Experience implementing security controls and responding to audits and incidents. Familiarity with DevSecOps concepts and tools.
  • Collaboration & Communication: Excellent communication skills and a collaborative mindset. Able to work with cross-functional teams (Dev, IT Ops, Security, Product, and external vendors) and articulate cloud concepts to both technical and non-technical stakeholders. Experience in vendor management or partnership coordination is highly valued.
  • Education: Bachelor’s degree in Computer Science, Engineering, or related field preferred. Equivalent practical experience is also highly valued.

Nice To Haves

  • Architecture Experience: Experience as an Enterprise Architect or in an enterprise architecture capacity, with the ability to evaluate technology decisions in the context of broader organizational strategy, integration patterns, and long-term platform evolution.
  • AI/ML Infrastructure & Platform Engineering: Experience building and managing cloud infrastructure for AI/ML workloads, including GPU-accelerated compute, vector databases, model serving pipelines, and data lake architectures.
  • Container Orchestration & Microservices: Hands-on experience with containerization (Docker) and orchestration platforms (Kubernetes, ECS, AKS) in production, including service mesh, ingress management, and cluster operations.
  • Platform Engineering & Developer Experience: Experience implementing platform engineering practices—building internal developer platforms (IDPs), self-service infrastructure provisioning, golden paths, and developer experience tooling.
  • Data Engineering & Integration: Familiarity with cloud-native data services, data pipelines (e.g., Azure Data Factory, AWS Glue), and integration with analytics/BI platforms.
  • Domain-Specific Cloud Experience (AEC/GovTech): Experience with digital twin platforms, geospatial systems (Esri, Bentley, Autodesk), or engineering-specific cloud workloads relevant to the AEC industry.
  • Cloud Governance & Standards Leadership: Participation in cloud center of excellence (CCoE) programs, cloud governance board activities, or technology standards committees.
  • Certifications: Professional certifications (AWS, Azure, HashiCorp) will strengthen a candidate’s profile.

Responsibilities

  • Cloud Infrastructure Management & Automation
      • Multi-Cloud Environment Oversight: Design, implement, and manage cloud infrastructure on Microsoft Azure and AWS, covering compute, storage, networking, and platform services. Ensure these environments are configured for high availability, optimal performance, and resiliency. Coordinate with the VP of Infrastructure on architecture standards, capacity planning, and operational readiness.
      • CI/CD Pipeline Implementation: Develop and maintain robust CI/CD pipelines to automate software build, test, and deployment processes. Use tools such as GitHub Actions, Jenkins, Azure DevOps, or AWS CodePipeline to enable frequent and reliable releases. Collaborate with the VP of Product to align deployment cadences with product release schedules and feature rollout priorities.
      • Infrastructure as Code & Automation: Employ Infrastructure as Code (IaC) tools (e.g., Terraform, Azure Resource Manager/Bicep templates, AWS CloudFormation) to automate provisioning and configuration of cloud resources. Maintain version-controlled scripts for repeatable environment setups, and automate routine tasks (using PowerShell, Bash, or Python scripting) to improve efficiency and consistency.
  • Cloud Operations Best Practices (Scalability, Reliability & Cost Optimization)
      • Scalability & Reliability: Ensure all cloud architectures follow best practices for scalability and resilience. Implement auto-scaling groups, load balancers, and clustering to handle changes in demand. Design systems with fault tolerance (multi-AZ/multi-region deployments, backups, and disaster recovery plans) so that uptime and performance meet enterprise SLAs. Work with the VP of Infrastructure to validate capacity and resilience through regular system tests.
      • Performance Optimization: Continuously monitor and analyze system performance metrics to identify bottlenecks or inefficiencies. Tune applications and infrastructure (right-sizing instances, optimizing database performance, leveraging CDNs/caching) to improve response times and throughput.
      • Cost Management: Take ownership of cloud cost optimization. Use financial management tools (AWS Cost Explorer, Azure Cost Management) to track usage and spending. Identify opportunities to reduce costs: right-sizing resources, eliminating underutilized assets, and leveraging pricing models (reserved instances, savings plans). Implement governance for resource usage and educate teams on cost-aware development.
  • Security & Compliance
      • CISO Partnership: Work directly with the CISO and security team to ensure all cloud environments meet MBI’s security requirements and align with the organization’s cybersecurity framework.
      • Identity & Access Management: Implement robust security controls across Azure and AWS. Manage IAM roles, policies, and Azure Active Directory/Entra ID integrations to enforce the principle of least privilege. Set up SSO and MFA where needed, and regularly review access logs and permissions.
      • Cloud Security Best Practices: Configure network security groups, firewalls, and encryption mechanisms to safeguard data and services. Use Microsoft Defender for Cloud (formerly Azure Security Center) and AWS Security Hub to continually assess security posture. Ensure data at rest and in transit is encrypted using Azure Key Vault, AWS KMS, and appropriate network controls.
      • Compliance & Vulnerability Management: Ensure cloud environments comply with relevant standards and regulations (SOC 2, ISO 27001, CMMI, CMMC, FedRAMP) and MBI’s internal policies. Conduct regular security audits and vulnerability assessments, and coordinate penetration testing. Apply patches promptly using automated tools (AWS Systems Manager, Azure Automation). Establish and test incident response procedures in coordination with the CISO.
  • Support, Monitoring & Incident Response
      • Performance Monitoring: Set up comprehensive monitoring and alerting for all cloud systems and applications using Azure Monitor, AWS CloudWatch, and third-party APM tools (Datadog, New Relic, Grafana/Prometheus). Create dashboards and automated alerts for critical conditions.
      • Troubleshooting & Incident Management: Act as a primary responder to cloud-related incidents and outages. Investigate and troubleshoot infrastructure and deployment issues across all environments. Analyze logs, diagnose root causes, and restore service rapidly. Implement rollback or hot-fix strategies to minimize downtime. Conduct post-incident reviews and drive continuous improvement.
      • Service Reliability & Availability: Proactively implement measures to improve service uptime through capacity planning, chaos testing, and disaster recovery drills. Apply SRE principles: define SLIs/SLOs for critical services and ensure operations meet targets. Create and maintain runbooks and SOPs for common issues.
      • User Support & Collaboration: Collaborate with development teams, IT support, and the VP of Product’s organization to resolve cloud-related issues affecting application functionality or user experience. Provide guidance to developers on effective use of cloud dev/test environments. Communicate incident status and resolutions clearly to stakeholders.
  • Strategic Collaboration & Cloud Provider Partnerships
      • Vendor Partnership Management: Serve as MBI’s technical liaison with Microsoft Azure and AWS. Build and maintain strong relationships with cloud partner teams. Act as the primary point of contact for cloud providers within the organization, streamlining vendor support engagement.
      • Roadmap Alignment & Joint Initiatives: Work with Azure/AWS solution architects and account managers to stay informed about upcoming services and best practice guidelines. Align MBI’s cloud roadmap with provider offerings. Lead joint architecture reviews, well-architected framework assessments, and pilot programs for new services that support MBI’s strategic objectives.
      • Escalation & Support Coordination: Manage escalation processes with Microsoft and AWS support teams for critical issues. Coordinate high-priority support tickets, provide diagnostic information, and advocate for MBI’s needs. Negotiate and track enterprise support credits, proof-of-concept funding, and sponsorship programs.

Benefits

  • Medical, dental, vision insurance
  • 401(k) Retirement Plan
  • Health Savings Account (HSA)
  • Flexible Spending Account (FSA)
  • Life, AD&D, and short-term and long-term disability insurance
  • Professional and personal development
  • Generous paid time off
  • Commuter and wellness benefits