Senior Scientist, Synthetic Data and Privacy

NVIDIA•Santa Clara, CA

2d•$168,000 - $264,500

About The Position

NVIDIA is at the forefront of the AI revolution, and our research is shaping the future of large language models. We are looking for a Senior Scientist to join our team and help advance our capabilities in synthetic data generation and privacy-preserving AI. You will contribute to open-source libraries within the NVIDIA NeMo ecosystem that enable high-quality synthetic data generation and data privacy at scale, including context-aware anonymization. This role combines hands-on software engineering with applied research in LLMs and privacy-enhancing methods, and you will collaborate with research, engineering, and product teams, as well as external labs.

Requirements

PhD in Computer Science, Machine Learning, Statistics, or a related field, or equivalent experience.
2+ years of experience in applied LLM/NLP research and engineering, synthetic data generation, anonymization and PII detection, or related areas.
Comparable experience is also considered.
Proven track record of developing or maintaining software libraries used by a broad developer community.
Strong publication record at premier venues such as NeurIPS, ICML, ICLR, ACL or similar.

Nice To Haves

Active contributions to open-source projects, particularly in ML, security, or privacy domains.
Deep technical understanding of LLMs and inference optimization (quantization, distillation, latency/throughput tuning), with frameworks such as vLLM or TGI.
Ability to build and optimize scalable data processing pipelines for large-scale models.
Working knowledge of global privacy regulations such as GDPR or CCPA.

Responsibilities

Build LLM-based methods for synthetic data generation, privacy, and context-aware anonymization, with automated evaluation across multilingual text, documents, and multimodal content.
Optimize task-specific LLMs for low-latency, high-throughput inference (distillation, quantization), and scale our frameworks to run in real time.
Design and maintain open-source libraries and SDKs with clean APIs and strong documentation.
Drive software excellence with modern tooling, configuration-driven architecture, and professional Git and CI/CD practices.
Publish original research at top machine learning and AI conferences to maintain NVIDIA's technical leadership.
Mentor interns and junior researchers to foster technical growth within the team.