About The Position

We are looking for a Databricks Developer who can design, build, and support large-scale data processing solutions. This role focuses on working with complex, high-volume data in a secure environment. You should be comfortable writing clean Java code, building Spark jobs, and improving performance in distributed systems. You’ll work closely with engineers, analysts, and business teams to deliver reliable data pipelines that scale well and meet security and compliance requirements.

Requirements

  • Active federal background clearance (client-issued laptop is a plus)
  • Bachelor’s degree in Computer Science or a related field
  • 8+ years of hands-on experience in data engineering or big data development
  • Experience working with large enterprise or government data systems
  • Good understanding of data structures, security, and compliance
  • Strong experience with Java 8 or newer
  • Comfortable with Streams, Lambdas, and object-oriented design
  • Hands-on experience with Spark Core, Spark SQL, and DataFrames (a short job sketch follows this list)
  • Understanding of RDDs and when to use them
  • Experience with batch processing and streaming
  • Strong skills in Spark performance tuning
  • Able to debug jobs using Spark UI and logs
  • Experience with tools like HDFS, Hive, or HBase
  • Familiar with Kafka, S3, or cloud-based data lakes
  • Experience with Parquet, Avro, or ORC formats
  • Experience building batch and real-time ETL pipelines
  • Skilled in data cleansing, transformation, and enrichment
  • Experience running Spark on YARN, Kubernetes, or cloud platforms
  • Familiar with CI/CD tools like Jenkins or GitHub Actions
  • Experience with monitoring tools and production logs
  • Unit testing using JUnit or TestNG (a test sketch also follows this list)
  • Experience with Mockito or similar tools
  • Comfortable validating data and testing Spark jobs
  • Experience working in Agile teams
  • A habit of documenting your work clearly
  • Calm and focused when troubleshooting production issues
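
For a concrete sense of the Java and Spark skills above, here is a minimal sketch of the kind of batch job this role involves. It is illustrative only: the paths, column names, and schema are hypothetical, not taken from any real pipeline.

    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.lower;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class OrdersEtlJob {
        public static void main(String[] args) {
            // Local session for illustration; a real job would run on YARN,
            // Kubernetes, or a Databricks cluster.
            SparkSession spark = SparkSession.builder()
                    .appName("orders-etl-sketch")
                    .master("local[*]")
                    .getOrCreate();

            // Hypothetical input: an orders dataset stored as Parquet.
            Dataset<Row> orders = spark.read().parquet("/data/raw/orders");

            // Cleanse and enrich with the DataFrame API: drop rows missing
            // key columns, normalize a text column, and derive a total.
            Dataset<Row> cleaned = orders
                    .na().drop(new String[] {"order_id", "customer_id"})
                    .withColumn("status", lower(col("status")))
                    .withColumn("total", col("quantity").multiply(col("unit_price")));

            // Write the curated output, partitioned for downstream readers.
            cleaned.write()
                    .mode("overwrite")
                    .partitionBy("status")
                    .parquet("/data/curated/orders");

            spark.stop();
        }
    }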
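
On the testing side, Spark transformations can be exercised against a local session. A minimal JUnit 5 sketch follows, again with hypothetical class and column names:

    import static org.apache.spark.sql.functions.col;
    import static org.junit.jupiter.api.Assertions.assertEquals;

    import java.util.Arrays;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.RowFactory;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.StructType;
    import org.junit.jupiter.api.AfterAll;
    import org.junit.jupiter.api.BeforeAll;
    import org.junit.jupiter.api.Test;

    class ActiveFilterTest {
        private static SparkSession spark;

        @BeforeAll
        static void setUp() {
            // Local single-threaded session so the test needs no cluster.
            spark = SparkSession.builder()
                    .appName("active-filter-test")
                    .master("local[1]")
                    .getOrCreate();
        }

        @AfterAll
        static void tearDown() {
            spark.stop();
        }

        @Test
        void keepsOnlyActiveRows() {
            StructType schema = new StructType()
                    .add("id", DataTypes.StringType)
                    .add("active", DataTypes.BooleanType);
            Dataset<Row> input = spark.createDataFrame(
                    Arrays.asList(
                            RowFactory.create("a", true),
                            RowFactory.create("b", false)),
                    schema);

            // Hypothetical transformation under test: keep active rows only.
            Dataset<Row> result = input.filter(col("active"));

            assertEquals(1, result.count());
        }
    }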

Nice To Haves

  • Experience with Scala or Python in Spark
  • Hands-on experience with Databricks or Google Dataproc
  • Knowledge of Delta Lake or Apache Iceberg
  • Background in data modeling and performance design

Responsibilities

  • Build and maintain data pipelines using Apache Spark on Databricks
  • Write clean, efficient Java (Java 8 or higher) that follows coding best practices
  • Work with large structured and semi-structured data sets
  • Tune Spark jobs to improve performance, stability, and cost (a brief configuration sketch follows this list)
  • Partner with other teams to understand requirements and deliver solutions
  • Ensure data handling follows security and governance standards
  • Investigate and fix issues in production environments
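
As an illustration of the tuning responsibility above, here is a minimal sketch of the kind of settings a developer might adjust after reading the Spark UI. The values are placeholders, not recommendations; the right numbers depend on data volume and cluster size.

    import org.apache.spark.sql.SparkSession;

    public class TunedSessionSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("tuned-job-sketch")
                    // Match shuffle parallelism to data size and core count.
                    .config("spark.sql.shuffle.partitions", "400")
                    // Let Spark 3 adaptive execution coalesce small shuffle
                    // partitions and rebalance skewed joins at runtime.
                    .config("spark.sql.adaptive.enabled", "true")
                    // Broadcast small dimension tables instead of shuffling them.
                    .config("spark.sql.autoBroadcastJoinThreshold", "64MB")
                    .getOrCreate();

            // ... job logic; explain() and the Spark UI's stage and SQL tabs
            // are the usual starting points when diagnosing slow jobs.

            spark.stop();
        }
    }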