About The Position

At Amtrak, the Principal DevOps Engineer is a technical leader responsible for ensuring the resilience, scalability, and security of our digital platforms. This role combines software engineering, systems engineering, and a deep operational mindset to improve reliability through automation, observability, and proactive incident response. The successful candidate will drive architectural decisions around SLOs, error budgets, infrastructure as code, and deployment strategies while mentoring engineers and standardizing practices across teams. They will collaborate cross-functionally to implement scalable solutions that align with our goals for service health, security, and development velocity.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or related technical discipline.
  • At least 5 years of experience in DevOps, SRE, or Platform Engineering roles with leadership experience in automation and infrastructure reliability.
  • 3+ years hands-on experience in high-availability production environments with cloud-native security and observability tooling.
  • Deep expertise in AWS (or equivalent cloud platform), especially in compute, networking, IAM, and monitoring.
  • Proficiency in Terraform, AWS CDK, CloudFormation, Docker, and Linux systems.
  • Experience with ‘pipelines as code’ and setting up CI/CD with GitHub Actions, AWS CodeBuild/CodePipeline, or Jenkins.
  • Experience implementing and managing CI/CD systems with security tollgates and rollback logic.
  • Strong scripting skills in Python, Go, or Bash for automation and tooling.
  • In-depth understanding of SRE practices including incident response, SLO/SLA management, chaos engineering, and capacity modeling.
  • Familiarity with Git and GitOps patterns.
  • Proven track record of creating shared tooling and documentation that promotes operational excellence.

Nice To Haves

  • Master’s degree in Computer Science or equivalent.
  • Certifications: AWS DevOps Engineer Pro, Terraform Associate, CKA, or SRE-focused credentials.
  • Experience with developer portals (e.g., Backstage), service mesh (e.g., Istio), and security tooling (e.g., Vault, Falco, Trivy).
  • Knowledge of DORA metrics, reliability KPIs, and engineering effectiveness measurement frameworks.
  • Background in regulated environments (e.g., PCI, HIPAA, FedRAMP) with experience implementing security automation at scale.

Responsibilities

  • Architect progressive delivery (canary, blue-green, feature flags) for DevSecOps CI/CD pipelines.
  • Automate rollback/fail-forward and release evidence capture.
  • Standardize quality gates (tests, perf/chaos pre-prod).
  • Publish hardened base images and golden IaC modules with guardrails.
  • Enforce Kubernetes RBAC, network policies, quotas, and secrets standards.
  • Design multi-env promotion workflows with policy checks.
  • Establish SLOs/error budgets; drive cross-team reliability improvements.
  • Bake runbooks into alerts; add synthetic/load tests to pipelines.
  • Lead major incidents; land systemic fixes (not just patches).
  • Enforce short-lived creds, zero-trust patterns, and attestation/signing.
  • Automate compliance checks and evidence collection.
  • Partner with security on threat-modeling for platform changes.
  • Create internal libraries/CLIs with telemetry and docs.
  • Measure automation ROI (time saved, error-rate drop).
  • Orchestrate complex workflows (e.g., Step Functions/Argo Workflows).
  • Own a platform capability end-to-end (roadmap, SLAs, upgrades).
  • Drive adoption of best practices across multiple teams.
  • Write ADRs and decision logs that clarify trade-offs.
  • Define/validate RPO/RTO; automate restore drills and reports.
  • Tune critical paths for latency/throughput and cost.
  • Forecast impacts of migrations; deliver measurable cost/perf wins.

Benefits

  • health, dental, and vision plans
  • health savings accounts
  • wellness programs
  • flexible spending accounts
  • 401(k) retirement plan with employer match
  • life insurance
  • short- and long-term disability insurance
  • paid time off
  • back-up care
  • adoption assistance
  • surrogacy assistance
  • reimbursement of education expenses
  • Public Service Loan Forgiveness eligibility
  • Railroad Retirement sickness and retirement benefits
  • rail pass privileges