Purchasing Power • Posted 3 months ago
Full-time • Mid Level
Hybrid • Atlanta, GA
251-500 employees
Personal and Laundry Services

The Data Engineer will build, scale, and optimize real-time data systems and pipelines using Apache Kafka (Confluent), AWS, and a modern data stack. The role works hands-on with streaming, ETL, distributed infrastructure, and PostgreSQL to fuel analytics and product innovation, while deploying AI/ML frameworks, agentic automation, and MLOps tooling to enable powerful analytics, advanced modeling, and responsive data infrastructure.

Responsibilities

  • Architect and build real-time streaming pipelines with Kafka, Confluent Schema Registry, and ZooKeeper, ensuring scalable, event-driven data platforms (a minimal pipeline sketch follows this list)
  • Leverage AWS services: build and manage ETL/ELT workflows on Glue and EMR, deploy scalable workloads on EC2, and manage storage in S3
  • Optimize and maintain PostgreSQL and other databases: schema design, advanced SQL, and performance tuning
  • Integrate AI/ML tools and frameworks (TensorFlow, PyTorch, Hugging Face) into data workflows; design pipelines to prepare and serve data for training and inference
  • Automate data quality checks, feature extraction, and anomaly detection using AI-powered data validation and observability tools
  • Collaborate with ML engineers to deploy, monitor, and continuously improve machine learning models within production data pipelines (batch and real-time), leveraging MLOps platforms (e.g., MLflow, SageMaker, Airflow, Kubeflow)
  • Experiment with vector databases and retrieval-augmented generation (RAG) pipelines to support LLM and GenAI initiatives (a retrieval sketch also follows this list)
  • Build and optimize event-driven and cloud-native architectures, supporting scalable, reliable AI data products
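As a rough illustration of the streaming responsibilities above, here is a minimal sketch of a Kafka-to-PostgreSQL consumer. It is not the company's actual pipeline: the broker address, topic, consumer group, DSN, and order_events table are hypothetical placeholders, and it assumes the confluent-kafka and psycopg2 client libraries.

```python
import json

import psycopg2
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # placeholder broker address
    "group.id": "order-events-loader",      # hypothetical consumer group
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,            # commit offsets manually below
})
consumer.subscribe(["order-events"])        # hypothetical topic name

conn = psycopg2.connect("dbname=analytics") # placeholder DSN

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            raise RuntimeError(msg.error())
        event = json.loads(msg.value())
        with conn.cursor() as cur:
            # Idempotent upsert keyed on the event id (hypothetical schema)
            cur.execute(
                "INSERT INTO order_events (id, payload) VALUES (%s, %s) "
                "ON CONFLICT (id) DO UPDATE SET payload = EXCLUDED.payload",
                (event["id"], json.dumps(event)),
            )
        conn.commit()
        consumer.commit(message=msg, asynchronous=False)  # offset only after DB write
finally:
    consumer.close()
    conn.close()
```

Committing the Kafka offset only after the database transaction, combined with the idempotent upsert, gives at-least-once delivery without duplicate rows.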
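Similarly, the RAG experimentation mentioned above centers on a retrieval step that can be prototyped with brute-force cosine similarity before adopting a vector database. Everything below is illustrative: embed() stands in for a real embedding model, and the documents are made up.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder: a real pipeline would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

documents = [
    "Customers can spread purchases over scheduled payroll deductions.",
    "Order events stream through Kafka into PostgreSQL for analytics.",
    "Feature pipelines serve model training and inference on AWS.",
]
doc_matrix = np.stack([embed(d) for d in documents])
doc_matrix /= np.linalg.norm(doc_matrix, axis=1, keepdims=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    q /= np.linalg.norm(q)
    scores = doc_matrix @ q             # cosine similarity against every doc
    top = np.argsort(scores)[::-1][:k]  # indices of the k best matches
    return [documents[i] for i in top]

# The retrieved passages would be prepended to an LLM prompt as context.
print(retrieve("How do order events reach the warehouse?"))
```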

Requirements

  • Bachelor's degree in Computer Science, Engineering, Mathematics, or a related technical field
  • 3+ years of Data Engineering experience with hands-on Kafka (Confluent/OSS) and AWS experience
  • Hands-on with automated data quality, monitoring, and observability tools for AI/data workflows
  • Advanced SQL and strong database fundamentals in PostgreSQL as well as other relational and NoSQL databases
  • Proficiency in Python, Scala, or Java for workflow development and AI integrations
  • Proficient with synthetic data generation, vector stores, or GenAI data products
  • Experience integrating ML models into data pipelines, using frameworks like PyTorch, TensorFlow, and MLOps platforms (Airflow, MLflow, SageMaker, Kubeflow)

Benefits

  • Hybrid work model (Onsite/Offsite)
  • Comprehensive benefits: medical, dental, vision, and company-paid Basic Life/AD&D
  • 401(k) Retirement Plan
  • Flexible PTO
  • Career Development
  • Employee Purchase Program