Senior Manager, Platform Services - Weights & Biases

Weights & Biases
Sunnyvale, CA
5d
$188,000 - $275,000
Hybrid

About The Position

CoreWeave, the AI Hyperscaler™, acquired Weights & Biases to create the most powerful end-to-end platform to develop, deploy, and iterate AI faster. Since 2017, CoreWeave has operated a growing footprint of data centers covering every region of the US and across Europe, and was ranked as one of the TIME100 most influential companies of 2024. By bringing together CoreWeave’s industry-leading cloud infrastructure with the best-in-class tools AI practitioners know and love from Weights & Biases, we’re setting a new standard for how AI is built, trained, and scaled.

The integration of our teams and technologies is accelerating our shared mission: to empower developers with the tools and infrastructure they need to push the boundaries of what AI can do. From experiment tracking and model optimization to high-performance training clusters, agent building, and inference at scale, we’re combining forces to serve the full AI lifecycle, all in one seamless platform. Weights & Biases has long been trusted by over 1,500 organizations, including AstraZeneca, Canva, Cohere, OpenAI, Meta, Snowflake, Square, Toyota, and Wayve, to build better models, AI agents and applications. Now, as part of CoreWeave, that impact is amplified across a broader ecosystem of AI innovators, researchers, and enterprises.

As we unite under one vision, we’re looking for bold thinkers and agile builders who are excited to shape the future of AI alongside us. If you're passionate about solving complex problems at the intersection of software, hardware, and AI, there's never been a more exciting time to join our team.

You will be a primary driver of platform-wide performance and scaling initiatives. This is a high-visibility leadership role responsible for the technical foundation of Weights & Biases. You will manage the critical path between model training and data visualization, ensuring the platform scales to meet the demands of the world's largest AI labs.

Your mission is to drive a performance-first culture across the organization. You will be expected to move beyond your immediate reports, exercising cross-organizational influence to ensure that every architectural decision, from SDK telemetry capture to backend API design, prioritizes system efficiency and "time-to-glass" for the user.

Requirements

  • Experienced Leader: 10+ years of software engineering experience, with 3+ years in a management role leading teams focused on high-scale systems.
  • Technical Depth: You have a deep understanding of backend systems, API design (REST/GraphQL/gRPC), and high-throughput data ingestion.
  • Performance-Obsessed: You understand the nuances of system bottlenecks (CPU, memory, I/O) and have experience driving performance improvements that impact end-user experience.
  • Architectural Visionary: You can navigate complex technical trade-offs, ensuring that today's solutions don't become tomorrow's technical debt.
  • Strategic Communicator: You excel at translating complex technical roadmaps into business value for stakeholders and can influence engineering standards across the entire organization.

Nice To Haves

  • High-Scale Systems Experience: Proven track record of managing platforms that handle multi-petabyte datasets or ingestion rates exceeding millions of events per second.
  • Deep Observability Expertise: Hands-on experience with profiling and tracing tools (e.g., pprof, eBPF, OpenTelemetry) to diagnose performance bottlenecks across distributed service boundaries.
  • Architectural Influence: Past success in a "Player-Coach" or "Architect-Manager" capacity, where you successfully convinced multiple teams to adopt new performance patterns or infrastructure standards.

Responsibilities

  • Own the Scaling Strategy: Architect and execute a roadmap that balances high-velocity data ingestion with high-performance visualization. You will ensure the platform remains stable as we scale to support thousands of concurrent ML runs and billions of data points.
  • Drive Cross-Functional Performance: Act as a performance "ambassador" across the organization. You will work across team and department boundaries to identify bottlenecks in the data path, from the SDK level through the API to the final chart render.
  • Evolve Core Infrastructure: Lead the team responsible for W&B’s shared backend logic and internal tooling. You will own the schema definitions and code-generation workflows that ensure consistency and type-safety across all microservices.
  • Optimize the SDK Experience: Oversee the development of our client-side libraries to ensure they capture rich GPU telemetry and metrics with zero-to-minimal impact on the performance of the user’s training experiment.
  • Mentor and Grow Engineers: Manage and coach a high-caliber team of backend and systems engineers. You will foster a culture of operational excellence, emphasizing that "features are not complete until they are performant."

Benefits

  • Medical, dental, and vision insurance - 100% paid for by CoreWeave
  • Company-paid Life Insurance
  • Voluntary supplemental life insurance
  • Short and long-term disability insurance
  • Flexible Spending Account
  • Health Savings Account
  • Tuition Reimbursement
  • Ability to Participate in Employee Stock Purchase Program (ESPP)
  • Mental Wellness Benefits through Spring Health
  • Family-Forming support provided by Carrot
  • Paid Parental Leave
  • Flexible, full-service childcare support with Kinside
  • 401(k) with a generous employer match
  • Flexible PTO
  • Catered lunch each day in our office and data center locations
  • A casual work environment
  • A work culture focused on innovative disruption

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 101-250 employees
