About The Position

SMX is seeking a Junior Data Engineer (Training and Exercise Support) to assist in developing, operating, and sustaining secure data pipelines that support mission-focused training and exercise activities at INSCOM Headquarters. Under the guidance of senior engineers, this individual will support distributed data processing workflows, prepare and manage training datasets, and help maintain reliable data connectivity across hybrid on-premises and cloud environments. The role involves supporting data integration, automation, documentation, and compliance efforts while collaborating with integrated project teams, exercise planners, and stakeholders to ensure data availability, data integrity, and adherence to security and operational standards. This is a full-time onsite position in Ft. Belvoir, VA.

Requirements

  • Active TS security clearance and eligibility for SCI and NATO read-on prior to starting work
  • Bachelor’s degree in a Science, Technology, Engineering, or Mathematics (STEM) discipline with up to 2 years’ experience as a software engineer or data engineer
  • Up to 2 years’ experience with Python, Java, or other programming languages, in an academic or professional environment.
  • Up to 2 years’ experience with Git repositories and CI/CD pipelines (e.g., GitHub, GitLab), in an academic or professional environment.
  • Up to 2 years’ experience with distributed parallel streaming and batch data processing pipelines, in an academic or professional environment.
  • Up to 2 years’ experience integrating with data SDKs / APIs and data analytics SDKs / APIs, in an academic or professional environment.
  • Knowledge of database management systems, query languages, table relationships, and views.
  • Skill in communicating in writing.
  • Skill in communicating verbally.

Nice To Haves

  • Up to 2 years’ experience as a data engineer
  • Up to 2 years’ experience applying agile methodologies to the software development life cycle (SDLC)
  • 1 year of experience developing, operating, and maintaining data processing pipelines in a classified environment
  • 1 year of experience in data mapping, modeling, enriching, and correlating classified data
  • 1 year of experience with Python / PySpark
  • 1 year of experience with Java / Java interface to Spark
  • 1 year of experience with Palantir Foundry
  • Knowledge of cloud computing deployment models in private, public, and hybrid environments and the difference between on-premises and off-premises environments.
  • Knowledge of cloud computing service models: Software as a Service (SaaS), Infrastructure as a Service (IaaS), and Platform as a Service (PaaS).
  • Knowledge of data security roles and responsibilities.
  • Knowledge of database access application programming interfaces (APIs) (e.g., Java Database Connectivity [JDBC]).
  • Knowledge of data administration and data standardization policies and standards.
  • Knowledge of data operations (DataOps) processes and best practices.
  • Skill in analyzing information from multiple sources.
  • Skill in creating technical documentation.
  • Skill in analyzing large data sets.
  • Skill in collaborating with stakeholders.
  • Certification at the IAM II level, e.g., Security+ CE, GSLC.

Responsibilities

  • Assist in the development and implementation of distributed data processing pipelines, including data connection, parsing, normalization, mapping, and enrichment under the guidance of senior engineers.
  • Read, interpret, and modify basic scripts on Windows and UNIX systems to support tasks such as data parsing, process automation, and data fetching.
  • Support the operation and maintenance of data processing pipelines to ensure they meet the availability and performance requirements of the platform.
  • Apply secure programming practices by following established coding standards and assisting in identifying potential flaws to mitigate vulnerabilities.
  • Prepare and manage test data required to support training events and exercise scenarios, ensuring datasets are properly staged and configured.
  • Work with integrated project teams and exercise planners to identify, curate, and manage data from various sources to meet specific training objectives.
  • Assist users and stakeholders by serving as a point of contact for data-related issues that arise during training events and exercises.
  • Help configure and troubleshoot data connections for a variety of sources across both on-premises and cloud environments, ensuring reliable data flow.
  • Respond to stakeholder requests for data access and connectivity, working with them to gather requirements and document technical needs for new data integrations.
  • Learn to operate tools and services in a hybrid infrastructure, gaining experience with both traditional systems and cloud platforms (e.g., AWS, Azure).
  • Update and maintain technical documentation such as System Design Documents (SDD), Standard Operating Procedures (SOPs), and Tactics, Techniques, and Procedures (TTPs).
  • Implement data management standards by complying with established requirements and specifications for data handling.
  • Adhere to all data classification and handling requirements by implementing access controls and following security best practices.
  • Follow agile methodologies as part of the software development life cycle (SDLC) and actively participate in team ceremonies.