About The Position

As the DevOps Team Lead for our Core Foundation pod, you will lead the "engine room" of Alpaca’s most critical infrastructure initiatives. You will manage a talented, globally distributed team of engineers (spanning APAC, EMEA, and AMER) responsible for Heavy Compute, Core Networking, Stateful Data, Observability, and Cloud/Physical Infrastructure.

This is a leadership and operations-first role. You need a solid technical background to guide the team and understand our stack, but we are not looking for an individual contributor who spends their days deep in technical details. Instead, we need a masterful planner, communicator, and people manager. You will establish robust support frameworks, orchestrate complex rollouts, manage incident response, and ensure the seamless delivery of our highest-priority annual initiatives in infrastructure and architectural evolution.

Requirements

  • Proven experience as an Engineering Manager, DevOps Lead, or Site Reliability Engineering Lead, with a track record of successfully managing globally distributed teams.
  • Exceptional people management skills, with a deep focus on coaching, mentoring, and fostering team culture across multiple time zones.
  • Deep expertise in engineering support frameworks, roadmap planning, and team prioritization methodologies.
  • Proven experience owning Change Management lifecycles. You have a unifying leadership style with a proven ability to break down organizational silos, build trust between disparate teams, and shepherd complex systemic updates from conception to deployment.
  • Extensive experience managing Incident Management lifecycles and running sustainable, global on-call rotations.
  • Exceptionally strong communication and organizational skills, with a proven ability to drive and coordinate complex, multi-stage technical rollouts and deployments.
  • A solid technical background in modern DevOps/SRE ecosystems. You don't need to be hands-on daily, but you must fluently understand the concepts and operational realities surrounding Kubernetes (GKE), Infrastructure as Code (Terraform), Relational Databases (PostgreSQL), and Observability stacks (Prometheus, Grafana, Thanos).
  • A strategic mindset capable of navigating shifting priorities, acting as the steady organizational force for the company's core infrastructure foundation.

Responsibilities

  • People & Tech Leadership: Lead, mentor, and foster a healthy, high-performing globally distributed engineering team.
  • Prioritization & Planning: Own the execution and delivery of critical, complex annual roadmap items centered on large-scale foundational infrastructure upgrades, high availability, and platform resilience.
  • Change Management Ownership: Own and drive the change management processes across engineering and product domains. You will orchestrate the smooth delivery of major systemic changes, ensuring alignment, mitigating friction, and breaking down silos between diverse technical groups to deliver cohesive infrastructure solutions.
  • Support Frameworks & Methodologies: Design, implement, and refine robust support workflows, agile planning methodologies, and deployment/rollout strategies to ensure operational excellence.
  • On-Call & Incident Management: Manage and optimize the global on-call rotation to ensure team well-being while maintaining high availability. Lead incident response (via Rootly), establishing clear communication, rapid resolution processes, and blameless post-mortems.

Benefits

  • Competitive Salary & Stock Options
  • Health Benefits
  • New Hire Home-Office Setup: One-time USD $500
  • Monthly Stipend: USD $150 per month via a Brex Card

Alpaca is proud to be an equal opportunity workplace dedicated to pursuing and hiring a diverse workforce.