Launch Legends • Posted 20 days ago
Part-time • Mid Level
Cheyenne, WY

Autheo is building a resilient DePIN platform guaranteeing 99.999% uptime across 1,000+ global nodes. The platform processes millions of transactions per second (TPS), delivers 200 GB/s of storage throughput, and runs AI inference, with unbreakable security and automated GDPR/HIPAA compliance maintained through zero-downtime upgrades. As a part-time Senior Site Reliability Engineer in an equity-based cofounder role, you’ll maintain infrastructure resilience, implement chaos engineering, and optimize for high-availability blockchain operations. This role is critical to sustaining 1B+ TPS, 200 GB/s DePIN data flows, and AI/ML workloads in a decentralized environment. If you’re passionate about reliability and optimization, join us to fortify the backbone of the next trillion-dollar decentralized economy.

Infrastructure Resilience
  • Maintain 99.999% uptime across 1,000+ global nodes with automated failover and zero-downtime upgrades.
  • Implement chaos engineering to test resilience against failures in blockchain/DePIN operations.
  • Optimize for 1B+ TPS and 200 GB/s storage throughput through proactive capacity planning.
Monitoring & Observability
  • Deploy Prometheus/Grafana for real-time monitoring of blockchain anomalies and DePIN performance.
  • Integrate OpenTelemetry for distributed tracing to keep MTTR under 15 minutes.
  • Build ML-powered alerting for threat detection and resource imbalances.
Compliance & Security
  • Embed GDPR/HIPAA-compliant monitoring with automated audit logging.
  • Implement zero-trust security for DePIN networks and AI inference pipelines.
  • Design disaster recovery plans for blockchain/DeFi incidents with a 95% recovery success rate.
Automation & Optimization
  • Automate infrastructure provisioning and scaling with Terraform/Ansible.
  • Optimize Kubernetes for blockchain node operations and AI workloads.
  • Conduct post-mortems and drive SLO/SLI improvements for continuous reliability.
Collaboration & Innovation
  • Collaborate with DePIN, blockchain, and AI/ML teams for integrated reliability.
  • Lead SRE reviews for scalability and compliance.
  • Mentor engineers and contribute to open-source SRE tools.
  • Present reliability innovations at SREcon/Web3 Summit.
Qualifications
  • Bachelor’s/Master’s in Computer Science or equivalent.
  • 5+ years in SRE for high-availability systems (99.999% uptime).
  • Expertise in Kubernetes, Prometheus, Grafana, OpenTelemetry.
  • Proficiency in chaos engineering, Terraform/Ansible, and compliance auditing.
  • Background in blockchain/DePIN operations or AI infrastructure.
  • Experience with open-source SRE tools or multi-cloud environments.
  • Contributions to SRE standards or patents in reliability engineering.