Sr. Technical Engineer

Summary
This role focuses on driving technical excellence by architecting, developing, and continuously improving advanced repair processes for our high-performance AI server infrastructure. This position requires deep hardware expertise, a methodical approach to troubleshooting, and the ability to innovate scalable repair solutions. As a key technical owner, you will lead efforts in designing processes, developing diagnostic tools, conducting root cause analysis, and influencing hardware designs to improve serviceability. Your mission is to establish a center of excellence for AI server repair through engineering rigor, advanced analysis, and reliable processes.

Key Responsibilities

Process Development & Implementation (Primary Focus):
Process Design: Architect, document, and execute end-to-end workflows for diagnosing and repairing AI servers and components. Develop detailed Standard Operating Procedures (SOPs), diagnostic flowcharts, and job-specific work instructions.
Tool Development: Design and implement advanced diagnostic software tools, scripts, and physical fixtures to improve accuracy, efficiency, and repeatability of troubleshooting and repair activities.
Advanced Validation: Define, test, and validate comprehensive test plans for components and full systems to meet high performance and reliability standards.
Process Control: Establish control points within workflows to monitor repair quality and gather repair and failure data for analysis and continuous improvement.

Failure Analysis & Advanced Engineering Support (Primary Focus):
Problem-Solving Expertise: Serve as the technical escalation point for resolving the most complex hardware issues. Triage, troubleshoot, and drive resolution for rare or unknown failure types.
Root Cause Isolation: Conduct deep Root Cause Analysis (RCA), involving schematic interpretation, board-level diagnostics, and meticulous troubleshooting to identify primary causes of failure.
Collaboration with Core Engineering Teams: Partner with Product Design, R&D, and Hardware Engineering teams to provide actionable feedback on failure trends and design weaknesses. Collaborate to influence future products with a focus on improved serviceability and reliability.

Technical Advancement and Guidance:
Development & Training: Create technical resources, training materials, and detailed documentation to propagate advanced diagnostic techniques and repair processes.
Knowledge Leadership: Serve as the primary source of technical expertise for the repair center, providing guidance and empowering technicians and engineers with advanced troubleshooting methodologies and engineering insights.
Prototyping & Innovation: Drive innovation through iterative prototyping and development of robust repair workflows to improve efficiency and system reliability.

Analytics and Continuous Improvement:
Data Analysis: Regularly analyze repair data to identify systemic failure trends, optimize existing processes, and track performance metrics such as test yields, repair turn-around times, and cost.
Process Optimization: Initiate and lead engineering-driven process improvement projects informed by data analysis to ensure consistent, high-quality repairs.
Feedback Loop with Manufacturing & Design: Support continuous improvement by providing actionable insights to manufacturing, design, and quality teams based on repair data and failure modes.

Job Type
Full-time
Career Level
Mid Level