Sr. Lead Data Engineer

eStaff LLC
Huntsville, TX (Hybrid)

About The Position

We are seeking a Sr. Lead Data Engineer for a hybrid W2 position (you must work on our W2 and commute to Huntsville, Texas, when required). This role will lead the design, implementation, and management of end-to-end, enterprise-grade data solutions for our Huntsville, Texas, client. It requires expertise in building and optimizing data warehouses, data lakes, and lakehouse platforms, with a strong emphasis on data engineering, data science, and machine learning. You will work closely with cross-functional teams to create scalable and robust architectures that support advanced analytics and machine learning use cases while adhering to industry standards and best practices.

Requirements

  • Bachelor's degree in Computer Science, Data Science, Engineering, or a related field.
  • Minimum of 10 years of experience in data engineering, data architecture, or a similar role, with at least 3 years in a lead capacity.
  • Proficient in SQL, Python, and big data processing frameworks (e.g., Spark, Flink).
  • Strong experience with cloud platforms (AWS, Azure, GCP) and related data services.
  • Hands-on experience with data warehousing tools (e.g., Snowflake, Redshift, BigQuery), Databricks running on multiple cloud platforms (AWS, Azure, and GCP), and data lake technologies (e.g., S3, ADLS, HDFS).
  • Expertise in containerization and orchestration tools like Docker and Kubernetes.
  • Knowledge of MLOps frameworks and tools (e.g., MLflow, Kubeflow, Airflow).
  • Experience with real-time streaming architectures (e.g., Kafka, Kinesis).
  • Familiarity with Lambda and Kappa architectures for data processing.
  • Ability to enable integration capabilities for external tools to perform ingestion, compilation, analytics, and visualization.

Nice To Haves

  • Certifications in cloud platforms or data-related technologies.
  • Familiarity with graph databases, NoSQL, or time-series databases.
  • Knowledge of data privacy regulations (e.g., GDPR, CCPA) and compliance requirements.
  • Experience in implementing and managing business glossaries, data governance rules, metadata lineage, and ensuring data quality.
  • Extensive experience with the AWS cloud platform and Databricks Lakehouse.

Responsibilities

  • Architect, design, and manage the entire data lifecycle, from ingestion, transformation, storage, and processing to advanced analytics and machine learning, across databases and large-scale processing systems.
  • Implement robust data governance frameworks, including metadata management, lineage tracking, security, compliance, and business glossary development.
  • Identify, design, and implement internal process improvements, including redesigning infrastructure for greater scalability, optimizing data delivery, and automating manual processes.
  • Ensure high data quality and reliability through automated data validation and testing, and deliver clean, high-quality, usable data from data sets in varying states of disorder.
  • Develop and enforce architecture standards, patterns, and reference models for large-scale data platforms.
  • Architect and implement Lambda and Kappa architectures for real-time and batch data processing workflows, and apply strong data modeling capabilities.
  • Identify and implement the most appropriate data management system, and enable integration capabilities for external tools to perform ingestion, compilation, analytics, and visualization.