Data Scientist - Tech - Top Secret required to apply - DC area

Bow Wave•Reston, VA

Onsite

About The Position

Conducts data analytics, data engineering, data mining, exploratory analysis, predictive analysis, and statistical analysis, and uses scientific techniques to correlate data into graphical, written, visual, and verbal narrative products that enable more informed analytic decisions. Proactively retrieves information from various sources, analyzes it to better understand the data set, and builds AI tools that automate certain processes. Duties typically include creating ML-based tools or processes, such as recommendation engines or automated lead scoring systems; performing statistical analysis; applying data mining techniques; and building high-quality prediction systems.

Requirements

Top Secret clearance required
Skilled in data visualization and use of graphical applications, including Microsoft Office (Power BI) and Tableau
Skilled in major data science languages, such as R and Python
Skilled in managing and merging of disparate data sources, preferably through R, Python, or SQL
Skilled in statistical analysis
Skilled in data mining algorithms
Prior experience with large data Multi-INT analytics
Prior experience with ML
Prior experience with automated predictive analytics

Responsibilities

Create data packages in the form of databases, reports, and visualizations
Communicate ongoing data science activities, technical findings, and data products for both technical and non-technical customers
Extract relevant features from large data stores of open source, PIA, and CAI data that contain bad records, partial records, errors, or other forms of noise
Extract features from open source information stored in a wide range of possible formats, including JSON, XML, raw text logs, industry-specific encodings, and graph link data
Apply natural language processing, computer vision, signal processing, and speaker and speech recognition algorithms to identify objects in text, image, video, and audio files
Apply descriptive and inferential statistics to describe data and make predictions about the data, including running statistical tests to determine confidence in a hypothesis, computing common summary statistics (e.g. mean, variance, and counts), fitting distributions to datasets, and using those distributions to predict event likelihoods
Be able to execute data science methods using parallel computing frameworks (e.g. Deeplearning4j, Torch, TensorFlow, Caffe, Neon, the NVIDIA CUDA Deep Neural Network library (cuDNN), and OpenCV) and distributed data processing frameworks (e.g. Hadoop, including HDFS, HBase, Hive, Impala, Giraph, and Sqoop; and Spark, including MLlib, GraphX, SQL, and DataFrames)
Be able to execute data science methods using common programming/scripting languages: Python, Java, Scala, and R (statistics)
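
For candidates gauging the level of the statistics work listed above (fitting a distribution to a dataset and using it to predict event likelihoods), the following is a minimal Python sketch. The exponential model, the function names, and the sample values are assumptions made purely for illustration; they are not taken from the employer's tooling or data.

```python
import math
from statistics import mean


def fit_exponential_rate(gaps):
    """Maximum-likelihood estimate of an exponential rate: 1 / sample mean.

    `gaps` are positive waiting times between events (hypothetical data).
    """
    if not gaps or any(g <= 0 for g in gaps):
        raise ValueError("gaps must be a non-empty list of positive numbers")
    return 1.0 / mean(gaps)


def prob_event_within(rate, horizon):
    """P(next event occurs within `horizon`) under the fitted exponential model."""
    return 1.0 - math.exp(-rate * horizon)


# Hypothetical waiting times, in hours, between observed events.
gaps_hours = [2.0, 4.0, 6.0, 8.0]

rate = fit_exponential_rate(gaps_hours)       # events per hour
p_next_5h = prob_event_within(rate, 5.0)      # likelihood of an event in the next 5 hours

print(f"fitted rate: {rate:.3f} per hour")
print(f"P(event within 5 h): {p_next_5h:.3f}")
```

In practice the choice of distribution would be checked against the data (for example with a goodness-of-fit test) before its predictions are trusted.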
