Infrastructure and Platform Engineer, Metal

Tenstorrent
Santa Clara, CA
$100,000 - $500,000
Hybrid

About The Position

Tenstorrent is leading the industry in cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

Tenstorrent’s AI Software Infrastructure team builds the platforms that power internal development, workload orchestration, and hardware allocation across large-scale AI systems. This role focuses on designing and operating Kubernetes-based platforms in on-prem data centers, enabling engineers and customers to run workloads efficiently on Tenstorrent hardware.

This role is hybrid, based out of Santa Clara, CA; Austin, TX; or Toronto, ON.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.

Requirements

  • Experienced backend or infrastructure engineer with a focus on platform development in large-scale environments.
  • Strong expertise in Kubernetes, including cluster provisioning, operators, and production debugging.
  • Proficient in Python or Go for building APIs and platform services.
  • Comfortable working with Linux systems, networking fundamentals, and distributed systems.
  • Collaborative and adaptable, able to work across engineering, infrastructure, and deployment teams.

Responsibilities

  • Design and build platform services for workload orchestration, ML services, and internal development workflows.
  • Develop APIs and systems that enable users and services to interact with infrastructure platforms.
  • Own Kubernetes-based platforms including cluster lifecycle, scaling, and operational maturity.
  • Integrate platform systems with CI/CD pipelines, GitOps workflows, and internal tooling.
  • Partner with SRE, infrastructure, and deployment teams to support large-scale internal and external environments.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Entry Level
  • Education Level: No Education Listed
  • Number of Employees: 101-250 employees
