About The Position

CopeCart is seeking an SRE / DevOps engineer to make its delivery processes faster, safer, and more automated. For the first 3-6 months, the focus will be on streamlining deployments for a new system, improving the reliability of the current AWS-based platform, and integrating agentic engineering practices that help engineers diagnose, fix, and safely ship changes. This is a hands-on role involving production systems, deployment pipelines, incident management, and collaboration with engineering teams. It requires comfort with operating production-grade systems, including VPN access, least-privilege permissions, runbooks, service maps, deployment procedures, audit trails, and rollback paths. A key part of the role is using AI agents as supervised engineering assistants for tasks such as inspecting code, analyzing operational context, suggesting fixes, generating runbooks, improving CI/CD, and safely debugging production issues.

Requirements

  • Strong hands-on experience in SRE, DevOps, platform engineering, infrastructure engineering, or production operations.
  • Production AWS experience.
  • Experience with ECS and containerized services.
  • Experience with GitHub Actions.
  • Experience with Terraform and/or OpenTofu.
  • Experience with CI/CD, Linux, networking, and production debugging.
  • Strong observability skills across metrics, logs, and traces.
  • Ability to write production-quality code or scripts in TypeScript and Bash.
  • Ability to read and modify infrastructure, CI/CD, and application code.
  • Good judgment around production risk, automation, permissions, and rollback.
  • Ability to use AI agents as part of a serious engineering workflow.
  • Ability to break down large operational problems into smaller agent-assisted tasks.
  • Ability to provide agents with the right repo context, runbooks, constraints, and success criteria.
  • Ability to use agents to inspect code, infrastructure, CI/CD, logs, and service behavior.
  • Ability to verify agent output instead of trusting it blindly.
  • Ability to turn useful agent output into production-grade changes.
  • Ability to design workflows where agents suggest fixes, but humans approve risky actions.
  • Knowledge of when not to use an agent.

Nice To Haves

  • Kubernetes experience.
  • Ruby experience.
  • Experience building internal developer platforms or self-service infrastructure.
  • Experience with coding agents, AI-assisted engineering workflows, repo-level agent instructions, evals, or agent guardrails.
  • Experience improving incident response, deploy safety, or on-call quality.

Responsibilities

  • Improve the deployment experience for the new system.
  • Reduce operational bottlenecks that slow down engineering and feature delivery.
  • Strengthen the AWS production setup, currently based on ECS and containers.
  • Improve GitHub Actions CI/CD workflows.
  • Work with Terraform / OpenTofu to make infrastructure safer, clearer, and easier to change.
  • Improve production debugging across AWS, containers, networking, Linux, and application-level issues.
  • Improve observability across metrics, logs, and traces.
  • Create or improve runbooks, repo instructions, service maps, deployment guides, and operational documentation.
  • Introduce agentic engineering workflows to help engineers diagnose issues, propose fixes, and validate changes before production.
  • Design safe guardrails for agent-assisted work, including permissions, approval gates, auditability, sandboxing, rollback procedures, and human review.

Benefits

  • Access to attractive corporate benefits
  • Company pension plan
  • Company fitness through EGYM Wellpass (access to over 6,300 fitness and yoga studios, swimming pools, CrossFit and bouldering gyms across Germany and Austria)
  • Team events (several times a year, in different locations)