ML Data Engineer (Contract-to-hire)

Potomac | Bethesda, MD

About The Position

Potomac is continuing to invest in modern data and AI capabilities to support our growing business. We are seeking a Machine Learning Data Engineer to join our team and play a critical role in building and scaling our data infrastructure. This role will focus on designing and maintaining data pipelines, enabling machine learning and analytics use cases, and ensuring high-quality, well-governed data is available across the organization. This position will work closely with Operations, Technology, Analytics, and business stakeholders to translate data needs into reliable, production-ready data solutions.

Requirements

  • Bachelor’s degree in Computer Science, Data Engineering, Engineering, or a related field (or equivalent experience).
  • 4+ years of experience in data engineering or related roles.
  • Strong proficiency in Python and SQL.
  • Hands-on experience building and operating data pipelines and workflows.
  • Experience with modern data platforms (data lakes, data warehouses, or lakehouse architectures).
  • Familiarity with orchestration tools (e.g., Airflow, Dagster, Prefect) and data transformation frameworks.
  • Solid understanding of data modeling, schema design, and data quality best practices.
  • Experience integrating data from APIs and third-party systems.
  • Strong problem-solving skills and ability to work independently in a fast-paced environment.
  • Excellent communication skills and ability to work with both technical and non-technical stakeholders.

Nice To Haves

  • Experience supporting machine learning workflows (feature engineering, training datasets, or ML pipelines).
  • Familiarity with cloud platforms (AWS, Azure, or GCP).
  • Experience with streaming or near–real-time data pipelines.
  • Knowledge of data governance, security, and compliance best practices.
  • Prior experience in financial services, fintech, or regulated data environments.
  • Experience working in a high-growth or startup environment.

Responsibilities

  • Design, build, and maintain scalable data pipelines to ingest data from multiple internal and external sources (APIs, SaaS platforms, databases, files).
  • Develop and manage a centralized data lake / lakehouse to standardize and curate data for analytics, reporting, and machine learning use cases.
  • Implement ELT/ETL processes to clean, validate, transform, and model data into trusted datasets.
  • Build and maintain machine-learning–ready datasets and feature pipelines that support experimentation and production models.
  • Ensure data quality, freshness, and reliability through monitoring, alerting, and automated validation checks.
  • Partner with analytics and business teams to define data requirements, metrics, and reporting outputs.
  • Support downstream data consumption for BI tools, dashboards, operational reporting, and partner data exports.
  • Apply best practices around data governance, security, access controls, and documentation.
  • Collaborate cross-functionally to deliver scalable, maintainable data solutions aligned with business priorities.
  • Continuously improve performance, cost efficiency, and reliability of the data platform.

Benefits

  • Full suite of benefits