Apache Spark Developer

Absolute Business Solutions Corp – Herndon, VA
Onsite

About The Position

Absolute Business Solutions Corp (ABSC) is seeking a TS/SCI-cleared Apache Spark Developer to support NGA's Data Modernization Services (DMS) mission by building and optimizing large-scale data processing pipelines. This role focuses on developing high-performance Spark applications within a containerized, Kubernetes-based environment, supporting mission analytics, data exploitation, and AI/ML integration. The ideal candidate thrives in distributed data environments, understands performance tuning deeply, and can operate effectively in secure, air-gapped systems. This role is on-site with flexible hours in Herndon, VA; Springfield, VA; St. Louis, MO; or Aurora, CO. Clearance required for this role: TS/SCI eligibility with the willingness/ability to obtain a CI polygraph.

Requirements

  • TS/SCI (eligibility) with ability/willingness to obtain/maintain counterintelligence polygraph
  • Bachelor's degree plus 5 years' experience in data engineering or Spark development (will entertain additional years' experience in lieu of degree)
  • Strong hands-on experience with: Apache Spark (mandatory), Python (PySpark), Data processing at scale
  • Experience working with: Parquet and/or Delta Lake, Distributed data systems
  • Familiarity with: Docker / containerization, Kubernetes (basic to intermediate experience)
  • Experience with object storage systems (e.g., S3 or equivalent)
  • Strong troubleshooting and performance tuning skills
  • Proficiency in Bash or scripting

Nice To Haves

  • Experience with Scala for Spark development
  • Experience with Structured Streaming in production environments
  • Familiarity with Iceberg or lakehouse architectures
  • Experience with CI/CD pipelines (Jenkins, Git)
  • Exposure to Terraform or Infrastructure as Code
  • Experience supporting AI/ML data pipelines
  • Prior experience supporting NGA, IC, or DoD programs

Responsibilities

  • Design, develop, and maintain Apache Spark pipelines (batch and streaming) using PySpark and/or Scala
  • Process and transform large-scale datasets using modern data lake architectures (Delta Lake, Parquet)
  • Optimize Spark jobs for performance, including: Partitioning strategies, Shuffle optimization, Memory tuning, File sizing and storage efficiency
  • Implement Structured Streaming pipelines for near real-time data processing
  • Develop and deploy Spark applications within containerized environments (Docker)
  • Execute workloads in Kubernetes clusters, supporting scalable and distributed processing
  • Integrate Spark pipelines with downstream systems, including: Analytics platforms (SQL, notebooks), AI/ML workflows and feature engineering pipelines
  • Support data ingestion and storage in object-based systems (e.g., S3-compatible storage)
  • Troubleshoot data pipeline failures and ensure reliability in mission-critical environments
  • Operate within secure, air-gapped environments, including: Managing dependencies without internet access, Working within controlled network and security constraints

Benefits

  • Generous PTO plus 11 Federal Holidays
  • Retirement Planning – 401k Fully Vested with Match
  • Tuition Assistance Program – Annual contributions to help you pay down your loans
  • Annual Health and Wellness Allowance – buy an Apple Watch, a treadmill, or hit the gym on us
  • Career Development – Annual Funds to spend on Education and Training
  • Volunteer Time Off – Annually, all employees can spend 8 hours directly supporting a charity of choice
  • Charitable Match – ABSC matches an employee's donation to a qualifying charity
  • Referral Program – We pay for internal and external referrals!
  • LOV Awards – Earn bonus awards throughout the year from our Living Our Values awards program