AI Engineer - Gen AI/SWE - Weights & Biases

CoreWeave, Sunnyvale, CA
$188,000 - $275,000, Hybrid

About The Position

The AI team is a hands-on applied AI group at Weights & Biases that turns frontier research into teachable workflows. We collaborate with leading enterprises and the OSS community. We are the team that took W&B from a few hundred users to millions of users and made it one of the most beloved tools in the ML community.

This is a senior applied role at the research-to-product boundary. You will design, implement, and evaluate LLM applications and agents using cutting-edge techniques from the latest research, then document and teach them to our community and customers. The focus is application, not novel research: rapid prototyping, careful evaluation, and production-grade reference implementations with clear trade-offs. We prioritize responsible, safe deployment and reproducibility.

Requirements

  • Software engineering: 6+ years building production systems; strong Python or TypeScript + system design, testing, CI/CD, observability.
  • GenAI apps: shipped LLM-powered features (tools/agents/function calling), with measurable impact (latency/cost/reliability).
  • Agentic patterns: implemented planners/executors, tool orchestration, sandboxing, and failure taxonomies; familiarity with agent infra concerns.
  • RAG: pragmatic mastery of chunking, embeddings, vector/hybrid search, rerankers; experience with vector DBs/search indices and retrieval policy design.
  • Evaluation: designed LLM/RAG/agent evals (offline golden sets, counterfactuals, user studies, guardrail tests); stats literacy (variance, CIs, power).
  • Serving & productization: comfortable with queueing, caching, streaming, and cost controls; can debug latency at model, retrieval, and network layers.
  • Public signal: 2+ substantial OSS repos/blog posts/talks/videos with adoption (stars, forks, downloads, views) and reproducible artifacts.

Nice To Haves

  • Experience building with AI SDKs / agent frameworks (e.g., TypeScript/Python SDKs, planning libraries) and shipping developer-facing examples.
  • Production agent security/sandboxing, red-teaming, and policy/PII enforcement.
  • Operated eval platforms or built judge models/heuristics; experience leading metrics reviews with product/UX.
  • Customer-facing enablement: templates or reference implementations adopted by external teams at scale.

Responsibilities

  • Ship end-to-end GenAI workflows (prompting → RAG → tools/agents → eval → serve) with reproducible repos, W&B Reports, and dashboards others can run.
  • Build agentic systems (tool use, function calling, multi-step planners) with MCP servers/clients and secure tool/resource integrations.
  • Design evaluation harnesses (RAG/agent evals, golden sets, regression tests, telemetry) and drive continuous improvement via offline + online metrics.
  • Build in public: Publish engineering artifacts (code, docs, talks, tutorials) and engage with OSS and customer engineers; turn repeated patterns into reusable templates.
  • Partner with product/solutions to launch LLM-powered features with clear latency/cost/SLO targets and safety/guardrail checks.
  • Run growth experiments to track how the artifacts you build drive usage of the Weights & Biases suite of products.

Benefits

  • Medical, dental, and vision insurance - 100% paid for by CoreWeave
  • Company-paid Life Insurance
  • Voluntary supplemental life insurance
  • Short and long-term disability insurance
  • Flexible Spending Account
  • Health Savings Account
  • Tuition Reimbursement
  • Ability to Participate in Employee Stock Purchase Program (ESPP)
  • Mental Wellness Benefits through Spring Health
  • Family-Forming support provided by Carrot
  • Paid Parental Leave
  • Flexible, full-service childcare support with Kinside
  • 401(k) with a generous employer match
  • Flexible PTO
  • Catered lunch each day in our office and data center locations
  • A casual work environment
  • A work culture focused on innovative disruption