Data Architect

ATTAINX INC, Herndon, VA
Remote

About The Position

The Data Architect / Data Engineering Lead provides technical leadership for data architecture, data engineering, database modernization, and AI/ML enablement across the NRCS IT ecosystem. This role is responsible for guiding the transformation of legacy data platforms – including monolithic SQL Server environments, SSIS-based ETL pipelines, and tightly coupled cross-database dependencies – into scalable, cloud-native architectures on AWS. The position works in close coordination with the Enterprise Lead Architect, Government Program Managers, and cross-functional delivery teams to execute data management, modernization, and operational sustainment activities under the OMNI contract.

Requirements

  • Experience: 10+ years of progressive experience in data architecture, data engineering, and database administration across enterprise environments.
  • Cloud Platforms: 5+ years of hands-on experience designing and deploying data solutions on AWS, including direct experience with S3, Glue, EMR/Spark, Lambda, Step Functions, DMS, RDS (PostgreSQL, Aurora), DynamoDB, OpenSearch, and Lake Formation.
  • Database Technologies: Deep expertise in Microsoft SQL Server (including HA/DR configurations, performance tuning, stored procedures, and large-scale database operations) and PostgreSQL/PostGIS. Experience with database decoupling and monolithic database decomposition.
  • Data Engineering: Proven experience building production data pipelines using Spark, PySpark, Databricks, and AWS Glue for batch, streaming, and geospatial workloads. Experience modernizing legacy ETL (SSIS) to cloud-native frameworks.
  • Programming: Strong proficiency in SQL/T-SQL, Python, and PySpark. Working knowledge of Bash/PowerShell for automation.
  • Architecture: Demonstrated ability to design and implement enterprise data architectures including data warehouses, data lakes, lakehouses (Delta Lake), and service-layer integration patterns.
  • Federal IT: 3+ years of experience supporting federal IT programs, with familiarity with FISMA, NIST RMF, ATO processes, and federal change management requirements.
  • DevSecOps: Experience with CI/CD pipelines, Git-based version control, Terraform or CloudFormation, Liquibase, and automated quality/security gates.
  • Agile/SAFe: Experience working within SAFe Agile or equivalent iterative delivery frameworks, including backlog management in Jira.
  • Must be able to obtain and maintain a USDA public trust clearance.

Nice To Haves

  • Direct experience with USDA NRCS systems, including NASIS, Soil Data Warehouse, Web Soil Survey, SSURGO, or related soil/conservation data platforms.
  • Experience with FPAC IT governance, the Technical Guidance Framework (TGF), and FPAC CI/CD pipeline standards.
  • Hands-on experience with AWS Bedrock, SageMaker, and Generative AI patterns (RAG, embeddings, natural-language-to-SQL, LangChain).
  • Experience with geospatial data engineering, including PostGIS, GeoPackage, ArcGIS WFS/WMS services, and spatial data pipelines.
  • Experience with AI-enabled legacy modernization platforms (e.g., Rhino.ai or equivalent).
  • Azure experience (Synapse, ADF, ADLS, Azure ML Studio, Databricks on Azure) as a complement to primary AWS focus.
  • Relevant certifications: AWS Solutions Architect, AWS Data Analytics Specialty, Azure Data Engineer Associate (DP-203), or equivalent.
  • Master’s degree in Computer Science, Data Science, or related field (in progress acceptable).

Responsibilities

  • Data Architecture and Strategy
  • Define and maintain data architecture standards, patterns, and governance practices across all NRCS systems, ensuring alignment with FPAC’s Technical Guidance Framework (TGF), Cloud Memo directives, and Zero Trust principles.
  • Lead conceptual and logical decomposition of monolithic database structures (e.g., NPAD) into domain-aligned, modular schemas that support incremental modernization and cloud migration.
  • Architect service-layer data access patterns to replace direct cross-database queries and business logic embedded in stored procedures, reducing architectural fragility and enabling decoupled deployments.
  • Design and maintain data models for enterprise soil data systems including NASIS, Soil Data Warehouse (SDW), Soil Data Marts (SDM), and related spatial/tabular datasets.
  • Align supported systems with USDA’s cloud-native Lakehouse Data Strategy, including adoption of Databricks as the departmental standard data integration tool and elimination of duplicated data copies.
  • Register and maintain schemas, interfaces, and metadata in AWS DataZone (or Government-directed metadata tooling), ensuring synchronization across environments.
  • Data Engineering and Pipeline Development
  • Design, build, and maintain end-to-end data engineering pipelines using AWS-native services (Glue, EMR/Spark, Lambda, Step Functions, EventBridge, DMS, S3, RDS/Aurora PostgreSQL) for batch, streaming, geospatial, and near-real-time workloads.
  • Modernize legacy SSIS-based ETL/ELT pipelines to cloud-native equivalents (AWS Glue, Databricks, PySpark), improving scalability, maintainability, and operational efficiency.
  • Build and operate AWS DMS full-load and CDC pipelines to support migration of SQL Server databases to PostgreSQL/PostGIS and other target platforms.
  • Implement Delta Lake standards, partitioning strategies, and performance tuning across ingestion frameworks for structured, unstructured, and geospatial data.
  • Develop serverless orchestration workflows using Lambda, EventBridge, and Step Functions for event-driven processing and automated data operations.
  • Implement data quality controls (validation, reconciliation, monitoring) and maintain audit-ready evidence of data management activities.
  • Database Operations and Modernization
  • Provide senior-level DBA support for SQL Server clusters (including high-availability configurations, failover groups, and large-scale datasets exceeding 50 TB), as well as PostgreSQL/PostGIS, Aurora, and DynamoDB environments.
  • Lead database schema versioning, change tracking, and deployment automation using Liquibase and Government-approved CI/CD processes.
  • Execute database modernization activities including re-platforming from on-premises SQL Server to AWS RDS/Aurora, decoupling monolithic database dependencies, and eliminating cross-database stored procedure calls.
  • Develop and maintain application-specific database recovery runbooks, including validated restore procedures, dependency mapping, and configuration baselines aligned with DR/COOP requirements.
  • AI/ML and Generative AI Enablement
  • Design and implement AI/ML and Generative AI solutions using AWS services (Bedrock, SageMaker, OpenSearch) to support natural-language-to-SQL, automated metadata generation, conversational technical assistance, and AI-powered data pipeline optimization.
  • Apply GenAI tooling (e.g., Bedrock, LangChain, embeddings, RAG patterns) to accelerate documentation, schema analysis, and DevOps workflows.
  • Support AI-assisted analysis to detect redundant data flows, schema drift, and opportunities to simplify data integrations.
  • Leverage AI-enabled platforms (e.g., Rhino.ai or equivalent) for legacy system discovery, business logic extraction, and modernization acceleration where authorized by the Government.
  • AWS Migration Support
  • Provide data engineering and DBA expertise in support of the urgent AWS migration from DISC data centers, including troubleshooting, testing, and implementing operational adjustments to maintain continuity of mission-critical business functions (e.g., payment processing).
  • Support full on-premises to AWS migration for databases and data infrastructure, including provisioning, lift-and-shift, re-architecture, data migration validation, and issue resolution.
  • Design and execute data migration and transformation activities, including test data management and privacy-preserving techniques for non-production environments.
  • Governance, Compliance, and Knowledge Transfer
  • Maintain audit-ready documentation for all data architecture decisions, schema changes, pipeline configurations, and modernization artifacts in Government-designated systems of record.
  • Enforce FPAC architectural principles, secure coding standards, and NIST SP 800-53 controls across all data engineering and database activities.
  • Conduct architecture reviews, design assurance gates, and code reviews for data-related deliverables, ensuring adherence to quality standards and FPAC SonarQube thresholds.
  • Deliver knowledge transfer sessions to Government personnel and incoming vendors during transition periods, including complete documentation handoff of data systems, pipelines, and architectural decisions.
  • Maintain and update troubleshooting playbooks, runbooks, and knowledge articles for data systems in Government-designated repositories.

Benefits

  • Competitive compensation and benefits package, including paid vacation, medical, dental, vision, matching 401(k) plan, tuition/training reimbursement, and long- and short-term disability.