Data Engineer - AI Practice Team

American Bureau of Shipping
Houston, TX
$95,000 - $150,000

About The Position

ABS is seeking an exceptional Data Engineer to join us full-time on our Artificial Intelligence (AI) Practice Team. In this role, you will design and operate the data foundations that power AI chat assistants, custom AI models, and AI-driven process optimization for ABS Consulting clients. You will build robust pipelines that integrate structured and unstructured data, standardize and tag enterprise content, and enable scalable, low-latency retrieval for AI workloads. Location: This position can be based in Houston, Texas; Knoxville, Tennessee; or Washington, DC.

Requirements

  • Bachelor’s degree in Computer Science, Data Engineering, Information Systems, or a closely related technical field; Master’s degree preferred.
  • 6+ years of professional data engineering experience designing, building, and operating production data solutions.
  • Demonstrated experience working in data-intensive environments (e.g., analytics platforms, AI/ML workloads, large-scale content repositories, or enterprise data platforms).
  • Hands-on experience delivering solutions on at least one major cloud provider (AWS, Azure, or Google Cloud), including managed data and analytics services.
  • Strong command of SQL and at least one programming language commonly used in data engineering (Python preferred) for building production-grade data pipelines.
  • Hands-on experience with modern data processing frameworks and platforms (e.g., Spark, Databricks, Snowflake, BigQuery, Synapse, or similar).
  • Proficiency with ETL/ELT orchestration tools and workflows (e.g., Airflow, dbt, Azure Data Factory, AWS Glue, or equivalent).
  • Experience designing and operating data lakes/lakehouses and integrating multiple data sources (relational, NoSQL, files, APIs) into cohesive data models.
  • Deep experience working with unstructured and semi-structured data (documents, PDFs, JSON, logs), including content extraction, normalization, and metadata/tagging.
  • Familiarity with AI/ML data patterns, including feature engineering, embeddings, vector databases, and retrieval-augmented generation (RAG) pipelines.
  • Strong understanding of data modeling, data quality, data governance, and lineage practices for regulated or compliance-sensitive environments.
  • Proficiency with cloud-native data services (e.g., S3/ADLS/GCS, managed warehouses, streaming services like Kafka/Kinesis/Event Hubs).
  • Solid grounding in software engineering best practices (version control, CI/CD, testing, code review) as applied to data engineering.

Responsibilities

  • Design, build, and maintain scalable ETL/ELT pipelines to ingest, clean, and transform structured and unstructured data for AI assistants and custom models.
  • Integrate diverse knowledge repositories (documents, policies, procedures, standards, databases) into centralized data platforms that support retrieval-augmented generation (RAG) and search.
  • Implement data standardization, normalization, and tagging pipelines to align content with enterprise taxonomies and ontologies.
  • Collaborate with AI/ML engineers to productionize model-ready datasets, feature stores, and embeddings for prediction, classification, and optimization use cases.
  • Optimize data workflows for reliability, cost, and performance across batch and streaming workloads, including monitoring, alerting, and capacity planning.
  • Establish and enforce data quality, lineage, and governance practices to ensure trustworthy inputs to AI systems and process-automation solutions.
  • Automate and templatize common data engineering patterns to accelerate delivery across multiple client engagements and industry domains.
  • Partner with consultants and business stakeholders to translate process optimization and analytics requirements into robust, maintainable data solutions.