About The Position

Sandia's AI team 1466 is building DOE's next-generation AI Platform around three pillars (Data, Models, and Infrastructure) to solve high-impact "lighthouse problems" in agile deterrence, energy dominance, and critical minerals. As a Postdoctoral Appointee, you'll join the Data Pillar team to design, implement, and operate Sandia's AI-ready, zero-trust data ecosystem. Your work will transform raw simulation outputs, sensor and facility logs, experimental records, and production data into governed, provenance-tracked, and access-controlled datasets that power AI models, autonomous agents, and mission workflows across DOE's HPC, cloud, and edge environments.

Requirements

  • Possess, or are pursuing, a PhD in Computer Science, Data Science, Statistics, or a related science or engineering field; the PhD must be conferred within five years prior to employment
  • Experience or knowledge in building and maintaining production data pipelines (ETL/ELT) and data warehouses or data lakes
  • Proficiency in programming languages such as Python and SQL, plus experience with frameworks like Apache Spark or Dask (a minimal pipeline sketch follows this list)
  • Understanding of data security and zero-trust principles, including secure enclaves, attribute-based access control, and data masking or differential privacy (an illustrative access-control check follows this list)
  • Familiarity with cloud platforms (AWS, Azure, or GCP) and container orchestration (Kubernetes)
  • Ability to acquire and maintain a DOE Q-level security clearance
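
To ground the pipeline item above: a minimal ETL sketch in PySpark, one of the frameworks named in the requirements. The paths, column names, and unit conversion are hypothetical, not Sandia's actual data.

```python
# Hypothetical ETL sketch in PySpark: extract raw sensor logs, normalize
# them, and load curated Parquet into a lake. Paths and columns are invented.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("sensor-etl").getOrCreate()

# Extract: raw JSON sensor logs from a landing zone (hypothetical path).
raw = spark.read.json("s3://landing-zone/sensor-logs/")

# Transform: drop malformed rows, parse timestamps, convert units.
curated = (
    raw.dropna(subset=["sensor_id", "ts", "reading_f"])
       .withColumn("ts", F.to_timestamp("ts"))
       .withColumn("reading_c", (F.col("reading_f") - F.lit(32)) * 5.0 / 9.0)
)

# Load: partitioned Parquet in the curated zone, ready for catalog registration.
(curated.write
        .mode("overwrite")
        .partitionBy("sensor_id")
        .parquet("s3://curated-zone/sensor-readings/"))
```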
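
And for the zero-trust item: attribute-based access control grants access by matching subject and resource attributes rather than static role lists. A toy check follows; the sensitivity ordering and attribute names are invented for illustration.

```python
# Toy ABAC check: access depends on attributes, not role membership.
# The sensitivity ordering and attribute names are invented for illustration.
from dataclasses import dataclass

LEVELS = {"public": 0, "cui": 1, "restricted": 2}

@dataclass(frozen=True)
class Subject:
    clearance: str   # e.g. "restricted"
    project: str

@dataclass(frozen=True)
class Resource:
    classification: str   # e.g. "cui"
    project: str

def permit_read(subject: Subject, resource: Resource) -> bool:
    """Allow reads only when clearance dominates classification
    and the subject belongs to the resource's project."""
    return (LEVELS[subject.clearance] >= LEVELS[resource.classification]
            and subject.project == resource.project)

print(permit_read(Subject("restricted", "minerals"), Resource("cui", "minerals")))  # True
print(permit_read(Subject("public", "minerals"), Resource("cui", "minerals")))      # False
```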

Nice To Haves

  • Significant data research experience
  • Background in AI-mediated data curation: automated annotation, feature extraction, and dataset certification
  • Experience implementing data governance and metadata management tools (e.g., Apache Atlas, DataHub, Collibra)
  • Experience developing and refining data architectures and data flows
  • Hands-on background in MLOps and CI/CD for data and ML workflows (e.g., Jenkins, GitLab CI, MLflow)
  • Knowledge of human-factors engineering and UX design principles for data platforms
  • Knowledge of agile principles and practices and experience working as part of agile teams
  • Ability to work effectively in a dynamic, interdisciplinary environment, guiding technical decisions and mentoring junior staff
  • Strong written and verbal communication skills, with the ability to present complex data concepts to diverse audiences
  • Ability to obtain and maintain an SCI clearance, which may require a polygraph test

Responsibilities

  • Build and operate an AI-ready lakehouse
      ◦ Design and maintain a federated data lakehouse with full provenance/versioning, attribute-based access control, license/consent automation, and agent telemetry services
      ◦ Implement automated, AI-mediated ingestion pipelines for heterogeneous sources (HPC simulation outputs, experimental instruments, robotics, sensor streams, satellite imagery, production logs)
  • Enforce data security and assurance
      ◦ Develop a Data Health & Threat program: dataset fingerprinting, watermarking, poisoning/anomaly detection, red-team sampling, and reproducible training manifests (see the fingerprinting sketch after this list)
      ◦ Configure secure enclaves and egress processes for CUI, Restricted Data, and other sensitive corpora, with attestation and differential privacy where required (see the Laplace-mechanism sketch after this list)
  • Define and implement data governance
      ◦ Establish FAIR-compliant metadata standards, data catalogs, and controlled-vocabulary ontologies
      ◦ Automate lineage tracking, quality checks, schema validation, and leak controls at record-level granularity (see the validation sketch after this list)
  • Instrument AI workflows with standardized telemetry
      ◦ Deploy Agent Trace Schema (ATS) and Agent Run Record (ARR) frameworks to log tool calls, decision graphs, human hand-offs, and environment observations (see the logging sketch after this list)
      ◦ Treat agent-generated artifacts (plans, memory, configurations) as first-class data objects
  • Collaborate across pillars
      ◦ Work with the Models and Interfaces teams to integrate data services into training, evaluation, and inference pipelines
      ◦ Partner with Infrastructure engineers to optimize data movement, tiered storage, and high-bandwidth networking (ESnet) between HPC, cloud, and edge environments
      ◦ Engage domain scientists and mission leads on agile deterrence, energy grid, and critical minerals use cases to curate problem-specific datasets
  • Support continuous acquisition and benchmarking
      ◦ Design edge-to-exascale data acquisition systems with robotics and instrument integration
      ◦ Develop data/AI benchmarks (datasets, tools, and metrics) for pipeline performance, model evaluation, and mission KPIs
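
The "reproducible training manifests" item suggests a concrete pattern: hash every file in a dataset and record the digests alongside the run. A minimal sketch follows; the manifest layout and paths are assumptions, not an actual Sandia schema.

```python
# Hypothetical sketch: fingerprint each file in a dataset directory and
# emit a training manifest. The manifest layout is invented for illustration.
import hashlib
import json
import pathlib
from datetime import datetime, timezone

def fingerprint(path: pathlib.Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(dataset_dir: str) -> dict:
    """Walk a dataset tree and record a digest per file."""
    root = pathlib.Path(dataset_dir)
    return {
        "dataset": root.name,
        "created": datetime.now(timezone.utc).isoformat(),
        "files": [
            {"path": str(p.relative_to(root)), "sha256": fingerprint(p)}
            for p in sorted(root.rglob("*")) if p.is_file()
        ],
    }

if __name__ == "__main__":
    print(json.dumps(build_manifest("datasets/example"), indent=2))  # hypothetical path
```

Re-hashing before each training run and diffing the digests against the stored manifest is one simple way to surface tampering or silent dataset drift.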
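
For the differential-privacy item, a classic building block is the Laplace mechanism. The sketch below releases a noisy count; the epsilon value and data are illustrative only.

```python
# Hypothetical sketch of the Laplace mechanism for a counting query.
# A counting query has L1 sensitivity 1, so the noise scale is 1/epsilon.
import numpy as np

def dp_count(values, predicate, epsilon: float = 0.5) -> float:
    """Return a differentially private count of items matching predicate."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Invented example data: noisy count of readings above a threshold.
readings = [3.1, 7.4, 2.2, 9.8, 5.5]
print(dp_count(readings, lambda r: r > 5.0))
```

Smaller epsilon means more noise and stronger privacy; repeated queries consume privacy budget, which a real deployment would have to track.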
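
Record-level schema validation (the lineage/quality item) could be enforced with a library such as jsonschema; the record schema below is invented for illustration.

```python
# Hypothetical record-level schema check using the jsonschema library.
from jsonschema import Draft202012Validator

# Invented schema for an experimental record.
SCHEMA = {
    "type": "object",
    "required": ["record_id", "instrument", "ts"],
    "properties": {
        "record_id": {"type": "string"},
        "instrument": {"type": "string"},
        "ts": {"type": "string"},
    },
}

validator = Draft202012Validator(SCHEMA)

def violations(record: dict) -> list[str]:
    """Return human-readable schema violations (empty list if valid)."""
    return [e.message for e in validator.iter_errors(record)]

bad = {"record_id": "r-001", "ts": "2024-01-01T00:00:00Z"}  # missing instrument
print(violations(bad))  # ["'instrument' is a required property"]
```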
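
The posting names Agent Trace Schema (ATS) and Agent Run Record (ARR) frameworks but does not publish their schemas, so the sketch below only illustrates the general pattern of logging tool calls as structured, serializable records; all field names are invented.

```python
# Illustration only: invented field names, NOT the actual ATS/ARR schemas.
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from uuid import uuid4

@dataclass
class ToolCall:
    tool: str
    arguments: dict
    result_summary: str
    ts: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

@dataclass
class AgentRunRecord:
    run_id: str = field(default_factory=lambda: str(uuid4()))
    tool_calls: list = field(default_factory=list)

    def log(self, call: ToolCall) -> None:
        self.tool_calls.append(call)

record = AgentRunRecord()
record.log(ToolCall("query_catalog", {"dataset": "sensor-readings"}, "12 hits"))
print(json.dumps(asdict(record), indent=2))
```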

Benefits

  • Generous vacation
  • Strong medical and other benefits
  • Competitive 401k
  • Learning opportunities
  • Relocation assistance
  • Amenities aimed at creating a solid work/life balance

What This Job Offers

  • Job Type: Full-time
  • Career Level: Entry Level
  • Industry: National Security and International Affairs
  • Education Level: Ph.D. or professional degree
  • Number of Employees: 5,001-10,000 employees
