Data Management Engineer

Booz Allen Hamilton
McLean, VA
Posted 20h ago
$62,000 - $141,000

About The Position

The Opportunity:

Ever-expanding technology like IoT, machine learning, and artificial intelligence means there's more structured and unstructured data available today than ever before. As a data engineer, you know that organizing data gathered from disparate sources can yield pivotal insights. We need a data professional like you to help our clients find answers in their data and impact important missions, from fraud detection to cancer research to national intelligence.

As a data engineer at Booz Allen, you'll use your skills and experience to help build advanced technology solutions and implement data engineering activities on some of the most mission-driven projects in the industry. You'll develop and deploy the pipelines and platforms that organize disparate data and make it meaningful. Here, you'll work with a multi-disciplinary team of analysts, data engineers, developers, and data consumers in a fast-paced, agile environment. You'll sharpen your skills in analytical exploration and data examination while you support the assessment, design, development, and maintenance of scalable platforms for your clients.

Due to the nature of the work performed within this facility, U.S. citizenship is required.

Work with us to use data for good. Join us. The world can't wait.

Requirements

  • 3+ years of experience managing large-scale data ecosystems, including designing storage strategies such as intelligent tiering for cost optimization and performance
  • Experience with metadata management tools such as OpenMetadata, including setup, configuration, and integration into data pipelines to ensure discoverability, lineage tracking, and governance
  • Experience integrating storage and metadata strategies into Agile, cross-functional development teams to ensure alignment with real-time and batch processing pipelines
  • Knowledge of data lifecycle management strategies, including tiering data across hot, warm, and cold storage layers, retention policies, and archival workflows to support petabyte-scale and continuously-ingested datasets
  • Knowledge of cloud-based storage systems and intelligent tiering features, such as AWS S3 Intelligent-Tiering or equivalent capabilities in Azure or GCP, including their APIs and configuration
  • Knowledge of data observability advancements to monitor, assess, and optimize the performance of pipelines and data storage systems in distributed cloud environments
  • Ability to work with large datasets using programming languages such as Python to develop and optimize data organization, storage, and transformation workflows
  • Ability to obtain and maintain a Public Trust or Suitability/Fitness determination based on client requirements
  • Bachelor’s degree in Computer Science or Data Engineering
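
The lifecycle-management requirement above, tiering data across hot, warm, and cold layers using Python, could be sketched along these lines. The tier names match the requirement; the age thresholds are illustrative assumptions, not a stated policy:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical thresholds -- real values depend on observed access
# patterns and the cloud provider's pricing model.
TIER_THRESHOLDS = [
    (timedelta(days=30), "hot"),    # frequently accessed data
    (timedelta(days=90), "warm"),   # infrequently accessed data
]

def assign_tier(last_accessed: datetime, now: Optional[datetime] = None) -> str:
    """Map an object's last-access time to a hot/warm/cold storage tier."""
    now = now or datetime.now(timezone.utc)
    age = now - last_accessed
    for threshold, tier in TIER_THRESHOLDS:
        if age <= threshold:
            return tier
    return "cold"  # archival layer for everything older
```

In practice a classification like this would drive retention policies and archival workflows rather than run per-object in application code; cloud-native lifecycle rules handle the actual data movement at petabyte scale.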

Nice To Haves

  • Experience configuring cloud-native tools for policies on intelligent tiering, such as automating data movement between storage tiers using Lambda, Step Functions, or equivalent workflows
  • Experience working with OpenMetadata integrations into existing tools like Apache Airflow, Kubernetes, and large-scale orchestration systems, ensuring metadata catalogs automatically synchronize with pipeline operations
  • Experience working with containerized environments, such as Docker or Kubernetes, and modern orchestration tools, such as Airflow or Prefect, to optimize both metadata workflows and storage pipelines
  • Experience implementing data governance frameworks, such as access controls and lineage policies, that integrate with OpenMetadata or equivalent metadata tools
  • Knowledge of geospatial datasets or scientific data formats such as NetCDF, GRIB, or HDF5 commonly used in weather or satellite data systems, and their implications for storage architecture
  • Knowledge of distributed query engines such as Presto, Trino, Hive, or Spark, tuned for performance on a lakehouse or intelligent tiering-enabled data lake architecture
  • Knowledge of real-time data streaming tools and integrations such as Kafka or AWS Kinesis, ensuring metadata tracks changes and tiering strategies accommodate time-sensitive ingestion workflows
  • Knowledge of cost modeling for cloud data storage and of Agile engineering practices, including CI/CD pipelines and collaboration with data engineers, AI engineers, and product teams to iteratively deliver optimized data ecosystems
  • Ability to analyze usage patterns and recommend optimizations for performance, cost, and data accessibility at scale
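
The cloud-native tiering policies mentioned above can often be expressed declaratively rather than through custom workflows. As one hedged sketch, an S3 lifecycle rule that moves objects into the Intelligent-Tiering storage class might be built as follows; the bucket name and prefix are hypothetical examples:

```python
def intelligent_tiering_rule(prefix: str, rule_id: str) -> dict:
    """Build one S3 lifecycle rule that transitions objects under
    `prefix` into the INTELLIGENT_TIERING storage class.
    The shape matches boto3's put_bucket_lifecycle_configuration input."""
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            # Days=0 moves objects as soon as the lifecycle rule runs.
            {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"},
        ],
    }

lifecycle_config = {
    "Rules": [intelligent_tiering_rule("raw/", "raw-to-intelligent-tiering")]
}

# Applying it would look roughly like this (requires AWS credentials):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-data-lake", LifecycleConfiguration=lifecycle_config)
```

Lambda or Step Functions come into play when movement decisions depend on signals a static lifecycle rule cannot see, such as query-engine access logs or metadata-catalog tags.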

Benefits

  • At Booz Allen, we celebrate your contributions, provide you with opportunities and choices, and support your total well-being. Our offerings include health, life, disability, financial, and retirement benefits, as well as paid leave, professional development, tuition assistance, work-life programs, and dependent care.
  • Our recognition awards program acknowledges employees for exceptional performance and superior demonstration of our values.
  • Full-time and part-time employees working at least 20 hours a week on a regular basis are eligible to participate in Booz Allen’s benefit programs.
  • Individuals who do not meet the threshold are only eligible for select offerings, not inclusive of health benefits.
  • We encourage you to learn more about our total benefits by visiting the Resource page on our Careers site and reviewing Our Employee Benefits page.