Data Scientist - Environmental Resilience Databank (KSEF)

Kentucky Science & Technology Corporation
Lexington, KY
Hybrid

About The Position

The CAPTIVATE Data Scientist supports researchers, teachers, students, and the public by providing expertise in data curation and management, data analysis and statistical modeling, and research computing across a range of environmental and climate science areas. This role collaborates closely with research teams and educators to design, manage, and analyze complex datasets; develop reproducible workflows; and communicate results through publications, reports, presentations, and the CAPTIVATE web portal. The position emphasizes methodological rigor, transparency, FAIR data principles, and best practices in open and reproducible science.

The Data Scientist will design, build, and maintain analytic and visualization tools for the CAPTIVATE KY databank, which integrates climate and environmental hazard data to support research, education, and community resilience across Kentucky. This role focuses on data pipelines, modeling, and user-facing tools aligned with the goals of the CAPTIVATE KY strategic plan.

We recognize that the above description may not be all-inclusive or capture every potential ideal candidate. If you are a highly organized, skilled, and passionate professional looking to make an impact in our community, we invite you to apply.

Requirements

  • Master’s degree in Data Science, Statistics, Computer Science, or a quantitative Environmental/Climate Science field
  • A Bachelor’s degree in Data Science, Statistics, or Computer Science with 1 year of applicable experience may be substituted
  • Demonstrated experience supporting academic or scientific research
  • Proficiency in Python or R and common data science libraries (e.g., pandas, NumPy, scikit-learn, or tidyverse)
  • Experience with data pipelines (ETL/ELT), large/complex datasets, and SQL databases
  • Experience creating data visualizations and interactive tools (e.g., Dash, Shiny, Jupyter Notebooks, or Open OnDemand)
  • Strong foundation in statistics and data analysis
  • Experience with data visualization and reproducible research practices
  • Excellent communication and collaboration skills

Nice To Haves

  • 3 years’ experience working in an academic or research-intensive environment
  • Familiarity with machine learning, Bayesian methods, or climate data science
  • Experience with HPC, cloud computing, or scientific computational methods
  • Experience in multi-institutional, grant-funded, or university/research settings
  • Teaching, mentoring, or workshop facilitation experience

Responsibilities

Research Support & Collaboration
  • Partner with research teams and educators to design data-driven research projects
  • Advise on dataset design, metadata schema, sampling strategies, statistical methodologies, and analytical and visualization tools
  • Advise on dataset management, access, retrieval, and storage options
  • Design and maintain data ingestion, transformation, and quality-control pipelines for climate and environmental hazard datasets
  • Collaborate with system engineers on databank architecture, data models, and metadata for research, education, and community users

Data Analysis & Modeling
  • Clean, manage, and analyze structured and unstructured datasets
  • Apply statistical analysis, machine learning, and computational modeling techniques
  • Develop custom analysis pipelines using programming and statistical tools
  • Validate models and ensure methodological soundness

Research Computing & Reproducibility
  • Develop reproducible workflows using version control, documentation, and automation
  • Support use of high-performance computing (HPC), cloud, or shared research infrastructure
  • Promote best practices in data management, FAIR principles, and open science
  • Assist with data sharing, archiving, and compliance with funding agency requirements

Training & Consultation
  • Provide one-on-one consultations for researchers, teachers, and students
  • Develop and deliver workshops or short courses on data science methods and tools
  • Create documentation, tutorials, and example code for common research workflows

Communication & Visualization
  • Produce clear data visualizations and summaries for academic and non-technical audiences
  • Assist in preparing figures, tables, and supplementary materials for CAPTIVATE publications and the web portal
  • Communicate complex analytical results clearly and effectively

Benefits

  • KSTC is an equal opportunity employer and offers a competitive salary and benefits package.