About The Position

At Apple, the information powering Siri, Spotlight, Apple Maps, and Apple Intelligence doesn't appear by magic: it's harvested at massive scale from the live web by a distributed crawl platform you'll help build and operate. You'll join a small, high-impact team responsible for a system that continuously fetches, renders, and extracts structured knowledge from billions of web pages, feeding the intelligence layer behind Apple's most-used products.

We're looking for an engineer who doesn't just build distributed systems, but who leverages modern AI coding tools as a core part of their daily engineering workflow to move faster, write higher-quality code, and tackle more ambitious problems than traditional development cycles allow. This is a deeply technical role embedded in a production platform with strict latency SLOs, complex failure domains, and high operational stakes. You'll work across the full crawl lifecycle: how requests are scheduled and dispatched, how pages are fetched and rendered, and how structured data is extracted and delivered downstream. As a member of this team, you will own critical components of the system and help shape its roadmap.

Requirements

  • BS or MS in Computer Science or equivalent experience
  • Strong understanding of data structures and algorithms
  • Strong systems programming background — Rust, Scala, or Go
  • Experience building distributed systems that scale to billions of operations
  • Solid understanding of async programming models, queue-based architectures, and at-least-once / exactly-once delivery semantics
  • Deep expertise in cloud infrastructure deployments, including managing workloads across heterogeneous clusters (EKS and/or bare metal)
  • Hands-on Kubernetes experience — multi-cluster operations, resource tuning, HPA, rolling deployments
  • Hands-on AWS experience: SQS, S3, MSK (Kafka), EKS, IAM, VPC, Transit Gateways
  • Familiarity with Kafka — topic management, consumer group lag, partition rebalancing
  • Experience defining metrics, writing alert rules, and building dashboards for distributed services
  • Excellent interpersonal skills; able to work independently as well as cross-functionally

Nice To Haves

  • Experience with web crawling
  • Experience operating headless browser infrastructure at scale
  • Experience with Flink or Spark streaming/batch pipelines over Iceberg