About The Position

BridgePhase is seeking an Infrastructure Operations Engineer to join our team supporting the Department of Homeland Security (DHS). This role provides Tier 3 operational support across both AWS cloud infrastructure and an enterprise Drupal-based information sharing platform, combining deep infrastructure operations expertise with hands-on application and platform troubleshooting. The ideal candidate thrives in production environments, excels at incident response, and partners closely with development, DevSecOps, and security teams to maintain system stability, performance, and compliance. This is a remote position. As with any technical environment, the exact responsibilities of the role will evolve with the changing needs of our client. We are seeking versatile candidates who thrive on new challenges and can readily adapt to responsibilities beyond those listed in this posting.

Requirements

  • At least 8 years of total professional experience, including 5+ years in infrastructure operations, cloud engineering, or production support roles
  • Prior or current experience supporting government programs (GovCon experience required)
  • Strong technical knowledge and expertise in:
      • AWS core services (EC2, S3, RDS, VPC, ELB/ALB, CloudFront, Route53)
      • Cloud security services (IAM, Security Groups, KMS, CloudTrail, GuardDuty)
      • Infrastructure monitoring and observability (CloudWatch, Datadog, New Relic, or similar)
      • Infrastructure as Code (Terraform, CloudFormation, Ansible)
      • CI/CD pipeline operations (Jenkins, GitLab CI, AWS CodePipeline)
      • Linux/Unix system administration and command-line tools
      • Networking concepts (VPCs, subnets, routing, VPNs, DNS)
      • Log aggregation and analysis (CloudWatch Logs, ELK stack, Splunk)
      • Container technologies (Docker, ECS, EKS, Kubernetes)
  • Demonstrated ability to:
      • Manage and escalate production incidents
      • Troubleshoot complex issues under pressure in live environments
      • Perform root cause analysis and implement long-term fixes
      • Support security incident response and vulnerability remediation
      • Execute change management and configuration control
      • Maintain clear, accurate technical documentation
      • Work within ITIL or similar service management frameworks
  • Working knowledge of and familiarity with:
      • Federal security and compliance requirements (FedRAMP, FISMA, NIST)
      • DevSecOps practices and automation tooling
      • Backup, recovery, and disaster recovery procedures
      • Web application architecture and performance optimization
      • Database operations, backup/restore, and performance tuning
      • Agile development and operations methodologies
      • SLA management, KPIs, and operational reporting

Nice To Haves

  • Hands-on Drupal experience, including operational support, troubleshooting, or performance optimization
  • AWS infrastructure engineering experience beyond operations (design, modernization, or large-scale cloud migrations)
  • Experience supporting enterprise-scale or mission-critical information sharing platforms
  • AWS certifications (Solutions Architect, SysOps Administrator, Security Specialty)

Responsibilities

  • Provide Tier 3 support for complex infrastructure and application-related incidents
  • Monitor system health, performance metrics, application logs, and infrastructure telemetry
  • Troubleshoot and resolve production issues across AWS infrastructure and Drupal-based platforms
  • Support AWS cloud services including compute, storage, networking, and security components
  • Investigate and diagnose performance bottlenecks, resource constraints, and configuration issues
  • Support CI/CD pipeline operations and troubleshoot deployment or release failures
  • Perform root cause analysis for recurring incidents and implement preventive measures
  • Coordinate incident response and resolution with development, DevSecOps, security, and infrastructure teams
  • Execute routine maintenance tasks including patching, scaling, backups, and system updates
  • Support deployment activities and release verification in production environments
  • Manage user support tickets and ensure timely resolution within SLA requirements
  • Maintain and update technical documentation for operational procedures and known issues
  • Implement and maintain monitoring alerts, logging, and automated health checks
  • Support disaster recovery testing and business continuity planning
  • Ensure compliance with federal security requirements and audit controls
  • Interface with federal stakeholders on operational status, issue escalation, and resolution
  • Collaborate with AWS support and third-party vendors for escalated technical issues

Benefits

  • Competitive compensation that reflects your skills and impact
  • Multiple bonus programs rewarding performance, company growth, and employee referrals
  • Flexible PTO with 20 days to use when you need them
  • All federal holidays paid to help you truly recharge
  • Paid sick leave because health always comes first
  • 100% paid parental leave
  • 401(k) with 6% match and no vesting period
  • Top-tier medical, dental, and vision plans with low out-of-pocket costs
  • Short- and long-term disability and life insurance included
  • Pet insurance to support your four-legged family
  • Annual professional development budget for training, certifications, and conferences
  • Two paid community service days for causes that matter to you
  • Social pod budget to connect with teammates wherever you live

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 51-100 employees
