CPU Post-Silicon RAS Engineer

Qualcomm
Santa Clara, CA
Onsite

About The Position

We are seeking a post‑silicon CPU RAS engineer focused on Silent Data Corruption (SDC) / Silent Data Errors (SDE) on ARM ISA–based CPUs running on silicon platforms. This role centers on reproducing, detecting, and root‑causing failures that escape detection and are often first observed in customer or field environments. You will work hands‑on with bring‑up systems, validation boards, and customer platforms, partnering closely with Customer Engineering, architecture, RTL, firmware, and post‑silicon debug teams to drive root cause and mitigation.

Requirements

  • Strong programming skills in C, Python, and ARM assembly.
  • Solid understanding of computer architecture concepts, out‑of‑order (OOO) execution, and the ARM ISA.
  • Experience debugging silicon failures.
  • Ability to reason across hardware, firmware, OS, and workload behavior.
  • Strong analytical and communication skills for customer‑impacting issues.

Nice To Haves

  • Experience working directly with customers or field teams on CPU issues.
  • Familiarity with RAS and reliability features (detection, containment, escalation).
  • Experience with silicon bring‑up, validation boards, or datacenter/enterprise CPU deployments.

Responsibilities

  • Develop and run C and assembly tests directly on silicon (bare‑metal or OS‑based) to provoke and detect silent corruption in CPU pipelines, load/store, atomics, coherency, and cache interactions.
  • Build robust correctness oracles (redundant execution, invariants, checksums) to catch subtle wrong‑answer failures.
  • Analyze and root‑cause customer‑reported issues with symptoms such as data corruption, miscompares, or non‑deterministic failures without explicit machine checks.
  • Work closely with Customer Engineering to reproduce issues in lab or customer‑like environments; triage logs, dumps, and limited telemetry; and provide clear technical root‑cause hypotheses and mitigation guidance.
  • Use post‑silicon debug techniques such as performance counters, trace, JTAG, register/state dumps, and targeted instrumentation.
  • Reduce issues to minimal repros (often assembly‑only) to enable efficient handoff to RTL and architecture teams.
  • Integrate high‑signal SDC tests into ongoing datacenter stress and regression flows.

Benefits

  • Competitive annual discretionary bonus program.
  • Opportunity for annual RSU grants.
  • Highly competitive benefits package.