AMD-posted 2 months ago
Senior
San Jose, CA
Computer and Electronic Product Manufacturing

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

This role is for a senior technical contributor who drives end-to-end delivery of software solutions, contributing directly to and coordinating implementation and optimization across multiple teams for inference and training of machine learning models. The position involves interfacing with software and hardware engineering teams and AMD partners to plan, develop, and optimize use cases. This is an exciting opportunity to work on the cutting edge of GPU computing for machine learning.

  • Develop and implement the overall QA strategy and frameworks for testing GPU-based software products, spanning various hardware and software configurations.
  • Evaluate and improve existing QA methodologies, tools, processes, and best practices, including automation tools, testing methodologies, test configuration management, and performance testing techniques.
  • Collaborate with software developers, program managers, QA teams, and other stakeholders to incorporate their feedback into test strategy and design.
  • Define cataloging methods for test plans, test suites, and test cases that cover functional and non-functional requirements.
  • Analyze and debug complex failure scenarios in GPU software environments, including root cause analysis and implementation of corrective actions.
  • Establish and monitor metrics to assess the efficiency and effectiveness of the software development process, using data-driven insights to drive continuous improvement.
  • Provide training and mentorship to QA engineers and other stakeholders on best practices, testing methodologies, and tools used in the QA process.
  • Stay current with the latest trends and technologies in the Compute domain to ensure the implementation of best practices and cutting-edge testing methodologies.
  • Awareness of relevant industry standards and regulations, such as ISO and IEEE.
  • Relevant experience in Machine Learning and/or GPU programming.
  • Experience with deep learning frameworks (e.g., TensorFlow, Keras, PyTorch, Caffe, ONNX) and familiarity with CNN/LSTM model architectures.
  • Knowledge of CPU and GPU architecture, and experience in GPGPU programming technologies.
  • Proven experience in a SW or QA Architect or Senior Technical Engineer role.
  • Strong knowledge of software development methodologies, tools, and processes, including test planning, test design, test execution, and defect management.
  • Expertise in embedded software processes, systems architecture, and GPU technologies, with programming skills in languages such as C, C++, and Python.
  • Familiarity with various GPU hardware platforms and a wide variety of operating system variants (Linux and Windows).
  • Experience with automated testing tools and with Continuous Integration and Continuous Deployment (CI/CD) pipelines.
  • Strong analytical and problem-solving skills, with an ability to debug and resolve complex issues in software systems.
  • Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams and diverse stakeholders.
  • Led or played a key role in QA teams' transformations to agile development and validation methods.