Machine Learning System Hardware Architect

Baidu USA, Sunnyvale, CA
Onsite

About The Position

We are looking for a world-class Machine Learning System Architect (HW) to join our SoC team at Baidu’s Sunnyvale office. The successful candidate will be a motivated self-starter who thrives in a highly technical environment. As a Machine Learning System Architect, you will help the team architect and build high-performance machine learning silicon and connect thousands of Kunlun accelerators for distributed AI training.

Requirements

  • Knowledge of the machine learning market, technology and business trends, software ecosystem, and emerging applications.
  • Proven track record of 5+ years architecting hardware solutions for machine learning acceleration and optimization.
  • Experience with deep learning frameworks such as TensorFlow, PyTorch, and PaddlePaddle.
  • Strong track record of outreach to ML researchers and application developers.
  • Experience with CPUs, GPUs, memory systems, and accelerators.
  • Experience with performance simulation and modeling in C++.
  • Experience with SoC interconnects and NoCs.
  • Experience with area, frequency, and power optimizations.
  • Familiarity with video, DSP, Ethernet, and PCIe.
  • MS or PhD in Electrical or Computer Engineering.
  • Excellent communication skills in both English and Chinese.

Responsibilities

  • Create differentiated architectural innovations for Baidu’s Kunlun AI SoC roadmap.
  • Architect, simulate, and design machine learning solutions for AI products.
  • Develop system-level ML architectures that push the boundaries of performance, power, and latency.
  • Collaborate closely with teammates to ensure hardware and software are designed and optimized for maximum performance.
  • Monitor industry and academic trends in artificial intelligence and determine where they should intersect with our roadmaps.
  • Drive partnerships for access to advanced AI technologies.
  • Evaluate the power, performance, and cost of prospective architectures and subsystems.
  • Build scalable tools for modeling and performance evaluation.
  • Engage with system and application software engineers to ensure optimization of the entire hardware/software stack.
  • Engage with SoC design, verification, and validation engineers to realize the architecture.