TikTok • Posted about 1 year ago
$137,750 - $237,500/Yr
Full-time • Mid Level
Hybrid • Seattle, WA
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

The Machine Learning Engineer - Model Serving Infrastructure role at TikTok focuses on designing and implementing distributed inference infrastructure for various ranking models, including feeds, ads, and search. This position is crucial for enhancing the performance and reliability of online inference servers, thereby supporting TikTok's mission to inspire creativity and bring joy. The role involves collaboration with product teams to meet their requirements and improve system performance through effective monitoring and management tools.

  • Design and implement distributed inference infrastructure for feeds, ads, and search ranking models.
  • Build monitoring and management tools to oversee the reliability and scalability of online inference servers.
  • Triage system inefficiencies and bottlenecks to improve system performance.
  • Analyze bottlenecks and sources of instability, then design and implement solutions.
  • Collaborate with product teams to provide general solutions that meet their requirements.
  • Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field, or equivalent years of experience in a software engineering role.
  • Proficient in C/C++/CUDA with solid programming skills.
  • Familiar with deep learning serving frameworks such as TensorFlow Serving or TorchScript.
  • Experience in GPU performance optimization.
  • Experience contributing to an open-source machine learning framework (e.g., TensorFlow, JAX, PyTorch, TorchScript, MXNet, TensorRT).
  • Experience in developing and deploying large-scale systems.
  • Strong background in Hardware-Software Co-Design, High Performance Computing, ML Hardware Acceleration, or ML for Systems.
  • Ability to work independently and complete projects from beginning to end in a timely manner.
  • Good communication and teamwork skills to clearly communicate technical concepts with teammates.
  • Medical, dental, and vision insurance from day one.
  • 401(k) savings plan with company match.
  • Paid parental leave.
  • Short-term and long-term disability coverage.
  • Life insurance.
  • Wellbeing benefits.
  • 10 paid holidays per year.
  • 10 paid sick days per year.
  • 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure).