Data Engineer - AI Practice Team

American Bureau of Shipping
Houston, TX
$95,000 - $150,000

About The Position

ABS is seeking an exceptional Data Engineer to join us full-time on our Artificial Intelligence (AI) Practice Team. In this role, you will design and operate the data foundations that power AI chat assistants, custom AI models, and AI-driven process optimization for ABS Consulting clients. You will build robust pipelines that integrate structured and unstructured data, standardize and tag enterprise content, and enable scalable, low-latency retrieval for AI workloads. Location: This position can be based in Houston, Texas; Knoxville, Tennessee; or Washington, DC.

Requirements

  • Bachelor’s degree in Computer Science, Data Engineering, Information Systems, or a closely related technical field; Master’s degree preferred.
  • 6+ years of professional data engineering experience designing, building, and operating production data solutions.
  • Demonstrated experience working in data-intensive environments (e.g., analytics platforms, AI/ML workloads, large-scale content repositories, or enterprise data platforms).
  • Hands-on experience delivering solutions on at least one major cloud provider (AWS, Azure, or Google Cloud), including managed data and analytics services.
  • Strong command of SQL and at least one programming language commonly used in data engineering (Python preferred) for building production-grade data pipelines.
  • Hands-on experience with modern data processing frameworks and platforms (e.g., Spark, Databricks, Snowflake, BigQuery, Synapse, or similar).
  • Proficiency with ETL/ELT orchestration tools and workflows (e.g., Airflow, dbt, Azure Data Factory, AWS Glue, or equivalent).
  • Experience designing and operating data lakes/lakehouses and integrating multiple data sources (relational, NoSQL, files, APIs) into cohesive data models.
  • Deep experience working with unstructured and semi-structured data (documents, PDFs, JSON, logs), including content extraction, normalization, and metadata/tagging.
  • Familiarity with AI/ML data patterns, including feature engineering, embeddings, vector databases, and retrieval-augmented generation (RAG) pipelines.
  • Strong understanding of data modeling, data quality, data governance, and lineage practices for regulated or compliance-sensitive environments.
  • Proficiency with cloud-native data services (e.g., S3/ADLS/GCS, managed warehouses, streaming services like Kafka/Kinesis/Event Hubs).
  • Solid grounding in software engineering best practices (version control, CI/CD, testing, code review) as applied to data engineering.

Responsibilities

  • Design, build, and maintain scalable ETL/ELT pipelines to ingest, clean, and transform structured and unstructured data for AI assistants and custom models.
  • Integrate diverse knowledge repositories (documents, policies, procedures, standards, databases) into centralized data platforms that support retrieval-augmented generation (RAG) and search.
  • Implement data standardization, normalization, and tagging pipelines to align content with enterprise taxonomies and ontologies.
  • Collaborate with AI/ML engineers to productionize model-ready datasets, feature stores, and embeddings for prediction, classification, and optimization use cases.
  • Optimize data workflows for reliability, cost, and performance across batch and streaming workloads, including monitoring, alerting, and capacity planning.
  • Establish and enforce data quality, lineage, and governance practices to ensure trustworthy inputs to AI systems and process-automation solutions.
  • Automate and templatize common data engineering patterns to accelerate delivery across multiple client engagements and industry domains.
  • Partner with consultants and business stakeholders to translate process optimization and analytics requirements into robust, maintainable data solutions.