AI-Readiness & Data Automation Postdoctoral Scholar

Lawrence Berkeley National Laboratory, Berkeley, CA
Onsite

About The Position

The Earth and Environmental Sciences Area at Lawrence Berkeley National Laboratory (LBNL) seeks a postdoctoral researcher to develop AI-ready data for the U.S. Department of Energy’s ESS-DIVE repository. The DOE Biological and Environmental Research (BER) program produces valuable datasets increasingly used in AI/ML, but many are not AI-ready due to inconsistent formatting, missing metadata, or incompatible file types. The selected candidate will join an interdisciplinary team to improve how DOE environmental data is prepared for AI. This includes working with ESS-DIVE users and the broader community to create machine-readable data products and develop tools and guidance for contributors.

Requirements

  • Ph.D. in environmental science, earth science, informatics, or a closely related field.
  • Experience working with environmental/scientific datasets (cleaning, processing, analysis, synthesis).
  • Strong programming skills, especially Python (or comparable scientific programming).
  • Experience with LLM-assisted or agent-based workflows.
  • Strong written and oral communication skills, including the ability to explain technical requirements to non-experts.
  • Demonstrated record of scholarly or technical contributions (e.g., publications, reports, or significant software contributions).
  • Less than 3 years of paid postdoctoral experience.

Nice To Haves

  • Experience with metadata standards, data schemas, or FAIR principles, particularly with data formats commonly used in earth/environmental sciences (e.g., netCDF).
  • Experience building data pipelines for ingesting and harmonizing data from multiple sources and tracking data provenance.
  • Familiarity with agentic AI tooling such as Retrieval Augmented Generation (RAG) pipelines, agent skills, and Model Context Protocol (MCP) servers.
  • Ability/willingness to travel to partner institutions and conferences as needed.

Responsibilities

  • Develop practical guidance on what “AI-ready data” should include, extending beyond the FAIR (Findable, Accessible, Interoperable, Reusable) principles.
  • Build and extend tools that validate datasets and check whether they meet AI-readiness requirements.
  • Help automate dataset preparation using reporting format templates and structured workflows.
  • Lead creation of example AI-ready benchmark datasets and supporting documentation.
  • Define AI-ready data standards: Establish and maintain guidance on metadata and formatting requirements for DOE environmental datasets.
  • Build automated checks and tools: Develop LLM-supported methods to assess AI readiness and convert datasets into consistent, usable formats.
  • Drive training and adoption: Create documentation, tutorials, and outreach to promote AI-ready data practices across the research lifecycle.
  • Curate benchmark datasets: Select, standardize, and document ESS-DIVE datasets for AI training and validation.
  • Support automated workflows: Contribute to developing agent-based pipelines that streamline data preparation, validation, and integration.

Benefits

  • Exceptional health and retirement benefits, including pension or 401(k)-style plans
  • A culture where you’ll belong - we are invested in our teams!
  • In addition to accruing vacation and sick time, we also have a Winter Holiday Shutdown every year.
  • Parental bonding leave (for both mothers and fathers)

What This Job Offers

Job Type: Full-time
Career Level: Entry Level
Education Level: Ph.D. or professional degree
