Senior Site Reliability Engineer

ZetaChain | San Francisco, CA
Posted 7d ago | $140,000 - $190,000 | Remote

About The Position

We're building something ambitious at ZetaChain: the first universal blockchain and AI platform that connects everything (Bitcoin, Ethereum, Solana, and more) while pioneering in the GenAI space. We're backed by top investors, live on mainnet, and building the future of blockchain and AI technology. If you're excited about working on big, meaningful problems with a world-class team, you're in the right place.

We are looking for a Senior Site Reliability Engineer (SRE) to ensure the reliability, scalability, and security of ZetaChain’s production infrastructure. This role is highly hands‑on and execution‑focused. You will operate critical blockchain and AI‑adjacent infrastructure, build automation to reduce operational overhead, and partner closely with protocol, platform, and AI teams to design systems that are reliable by default.

Requirements

  • 4+ years of experience in Site Reliability Engineering, Infrastructure Engineering, or Platform Engineering
  • Strong software engineering background with production experience in Go and/or Python
  • Deep experience operating Linux systems in production
  • Proven experience running Kubernetes at scale
  • Experience supporting high‑availability distributed systems
  • Comfortable working in fast‑moving startup environments
  • Strong security mindset, especially for infrastructure running on public or adversarial networks
  • Excellent collaboration and communication skills
  • Languages: Go, Python, Bash, Terraform, Ansible
  • Infrastructure: Kubernetes, Docker, Linux
  • Observability: Prometheus, Grafana, Datadog, Loki, incident.io
  • Platforms: AWS, GCP, bare metal
  • Blockchain Stack: Cosmos SDK, Tendermint / CometBFT, Ethereum, Bitcoin

Nice To Haves

  • Exposure to AI‑powered infrastructure, observability, or developer tooling
  • Experience operating blockchain nodes or validator infrastructure
  • Familiarity with Cosmos‑based chains or EVM clients
  • Experience with DevOps, DevSecOps, or GitOps methodologies
  • Contributions to open‑source software

Responsibilities

  • Operate and maintain production blockchain infrastructure, including validators, RPC services, indexers, and supporting services
  • Ensure high availability and performance for AI‑enabled developer platforms and internal tooling
  • Build and maintain monitoring, alerting, and dashboards for protocol, infrastructure, and application health
  • Write high‑quality automation and infrastructure code to reduce toil and improve reliability
  • Participate in on‑call rotations, incident response, and post‑incident reviews
  • Partner with engineering teams to embed reliability, scalability, and security best practices into system design
  • Improve Kubernetes reliability across cloud and bare-metal environments
  • Continuously refine deployment, rollback, and recovery strategies

Benefits

  • Make a direct impact on infrastructure powering both blockchain and AI platforms
  • Work on technically challenging, real‑world distributed systems
  • Fully remote with quarterly in-person team meetups
  • Strong open‑source culture and modern engineering practices
  • Competitive compensation and meaningful ownership