About The Position

Conducts data analytics, data engineering, data mining, exploratory analysis, predictive analysis, and statistical analysis, and uses scientific techniques to correlate data into graphical, written, visual, and verbal narrative products that enable more informed analytic decisions. Proactively retrieves information from various sources, analyzes it to better understand the data set, and builds AI tools that automate certain processes. Duties typically include creating ML-based tools or processes, such as recommendation engines or automated lead scoring systems. Performs statistical analysis, applies data mining techniques, and builds high-quality prediction systems.

Should be skilled in data visualization and use of graphical applications, including Microsoft Office (Power BI) and Tableau; major data science languages, such as R and Python; managing and merging disparate data sources, preferably through R, Python, or SQL; statistical analysis; and data mining algorithms. Should have prior experience with large data Multi-INT analytics, ML, and automated predictive analytics.

Contractor shall: create data packages in the form of databases, reports, and visualizations; communicate ongoing data science activities, technical findings, and data products to both technical and non-technical customers; extract relevant features from large data stores of open source, PIA, and CAI that contain bad records, partial records, errors, or other forms of "noising"; extract features from open source information stored in a wide range of possible formats, including JSON, XML, raw text logs, industry-specific encodings, and graph link data; apply natural language processing, computer vision, signal processing, and speaker and speech recognition algorithms to identify objects in text, image, video, and audio files; apply descriptive and inferential statistics to describe data and make predictions about the data, including running statistical tests to determine confidence for a hypothesis, computing common summary statistics (e.g. mean, variance, and counts), fitting distributions to datasets, and using those distributions to predict event likelihoods; be able to execute data science methods using parallel computing frameworks (e.g. Deeplearning4j, Torch, TensorFlow, Caffe, Neon, NVIDIA CUDA Deep Neural Network library (cuDNN), and OpenCV) and distributed data processing frameworks (e.g. Hadoop (including HDFS, HBase, Hive, Impala, Giraph, and Sqoop) and Spark (including MLlib, GraphX, SQL, and DataFrames)); and be able to execute data science methods using common programming/scripting languages: Python, Java, Scala, and R (statistics).

Requirements

  • Top Secret clearance required
  • Skilled in data visualization and use of graphical applications, including Microsoft Office (Power BI) and Tableau
  • Skilled in major data science languages, such as R and Python
  • Skilled in managing and merging of disparate data sources, preferably through R, Python, or SQL
  • Skilled in statistical analysis
  • Skilled in data mining algorithms
  • Prior experience with large data Multi-INT analytics
  • Prior experience with ML
  • Prior experience with automated predictive analytics

Responsibilities

  • Create data packages in the form of databases, reports, and visualizations
  • Communicate ongoing data science activities, technical findings, and data products for both technical and non-technical customers
  • Extract relevant features from large data stores of open source, PIA, and CAI that contain bad records, partial records, errors, or other forms of "noising"
  • Extract features from open source information stored in a wide range of possible formats, including JSON, XML, raw text logs, industry-specific encodings, and graph link data
  • Apply natural language processing, computer vision, signal processing, and speaker and speech recognition algorithms to identify objects in text, image, video, and audio files
  • Apply descriptive and inferential statistics to describe data and make predictions about the data, including running statistical tests to determine confidence for a hypothesis, computing common summary statistics (e.g. mean, variance, and counts), fitting distributions to datasets, and using those distributions to predict event likelihoods
  • Be able to execute data science methods using parallel computing frameworks (e.g. Deeplearning4j, Torch, TensorFlow, Caffe, Neon, NVIDIA CUDA Deep Neural Network library (cuDNN), and OpenCV) and distributed data processing frameworks (e.g. Hadoop (including HDFS, HBase, Hive, Impala, Giraph, and Sqoop) and Spark (including MLlib, GraphX, SQL, and DataFrames))
  • Be able to execute data science methods using common programming/scripting languages: Python, Java, Scala, and R (statistics)