Staff Data Engineer/Scientist

CACI, Chantilly, VA

About The Position

We are seeking a Staff Data Engineer/Scientist who wants to take on new, challenging problems. You will support the development of AI/ML algorithms across a range of disciplines, including large language models, natural language processing, and time-series predictive analytics. You will also join a team of excellent researchers and software developers who are eager to mentor and teach their craft.

Requirements

  • B.S. in data science, AI/ML, computer science, or related field
  • Minimum six (6) years of relevant experience as a Data Engineer/Scientist
  • Experience developing data pipelines and normalizing data with canonical Python packages (e.g. NumPy, Pandas, Polars)
  • Experience contributing on a team using version control (e.g. git, GitLab, Bitbucket)
  • Active TS/SCI U.S. Government Security Clearance with a recent Full-Scope Polygraph (FSP)

Nice To Haves

  • M.S. or PhD in data science, AI/ML, computer science, or related field
  • Experience with GitLab, DevSecOps practices such as test-driven development, containers (e.g. Docker, Docker Compose), cloud services (e.g. AWS), and tools for distributed computing (e.g. Spark, PySpark)
  • Experience leading an interdisciplinary team of researchers and software developers
  • Experience with any of the following:
      • Large Language Models, including identifying ways to incorporate them into new domains and applications
      • Applying Transformer-based architectures to domains outside of Natural Language Processing (NLP), such as computer vision
      • Natural Language Processing models such as BERT
      • Reinforcement learning and familiarity with Gymnasium/Gym, OpenEnv, TorchRL, RLlib, and Stable Baselines
      • Applying clustering algorithms and/or deep neural networks to real-life problems
      • Implementing tracking and pattern-of-life algorithms
  • Experience with GenAI Ops techniques (e.g. LLM-as-a-judge) and frameworks (e.g. Langfuse, MLflow, Arize Phoenix)
  • Experience with Machine Learning libraries and frameworks such as Hugging Face and LangChain
  • Experience with Linux
  • Familiarity with AWS cloud computing resources such as EC2, S3, Lambda, and Bedrock
  • Experience with any of the following additional languages: Java, C++, Rust, Go, and/or C#
  • Experience implementing algorithms on the GPU in Python or C++ using CUDA and related CUDA libraries
  • Experience in application deployment, virtualization, and containerization (e.g. Podman, Docker, Kubernetes, Rancher)
  • Experience shaping and writing proposals

Responsibilities

  • Lead and mentor an interdisciplinary team of developers and researchers. The team's core focus is implementing ETL pipelines that support a variety of AI/ML and LLM solutions, which in turn address a broad range of customer challenges.
  • Assemble large, complex data sets to support AI/ML algorithm implementation
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from various data sources
  • Curate and maintain data stored in support of metrics and evaluation
  • Implement Artificial Intelligence/Machine Learning algorithms
  • Identify, design, and implement internal process improvements, including redesigning infrastructure for greater scalability, optimizing data delivery, and automating manual processes
  • Develop software using Agile methodologies

Benefits

  • Our employees value the flexibility at CACI that allows them to balance quality work and their personal lives.
  • We offer competitive compensation, benefits, and learning and development opportunities.
  • Our broad and competitive mix of benefits options is designed to support and protect employees and their families.
  • At CACI, you will receive comprehensive benefits such as healthcare, wellness, financial, retirement, family support, continuing education, and time-off benefits.