AI/ML Data Engineer

College Board
$137,000 - $148,000 | Remote

About The Position

As an AI/ML Data Engineer, you'll design, build, and operate the data and ML plumbing that powers personalized student experiences at scale. You'll create batch and streaming pipelines, ML-ready datasets, feature/embedding stores, and the services that move models into production safely and compliantly. You'll collaborate with Product, Data Science, and Analytics to turn raw events into reliable, privacy-preserving features that drive real impact for students and higher-ed partners.

In this role, you will:

ML Data Platform & Pipelines (40%)

  • Design, build, and own batch and streaming ETL (e.g., Kinesis/Kafka → Spark/Glue → Step Functions/Airflow) for training, evaluation, and inference use cases.
  • Stand up and maintain offline/online feature stores and embedding pipelines (e.g., S3/Parquet/Iceberg + vector index) with reproducible backfills.
  • Implement data contracts & validation (e.g., Great Expectations/Deequ), schema evolution, and metadata/lineage capture (e.g., OpenLineage/DataHub/Amundsen); a contract-check sketch follows this section.
  • Optimize lakehouse/warehouse layouts and partitioning (e.g., Redshift/Athena/Iceberg) for scalable ML and analytics.

Model Enablement & LLM DataOps (30%)

  • Productionize training and evaluation datasets with versioning (e.g., DVC/LakeFS) and experiment tracking (e.g., MLflow).
  • Build RAG foundations: document ingestion, chunking, embeddings, retrieval indexing, and quality evaluation (precision@k, faithfulness, latency, and cost); a retrieval-evaluation sketch follows this section.
  • Collaborate with Data Science to ship models to serving (e.g., SageMaker/EKS/ECS), automate feature backfills, and capture inference data for continuous improvement.

Reliability, Security & Compliance (15%)

  • Define SLOs and instrument observability across data and model services (freshness, drift/skew, lineage, cost, and performance).
  • Embed security & privacy by design (PII minimization/redaction, secrets management, access controls), aligning with College Board standards and FERPA.
  • Build CI/CD for data and models with automated testing, quality gates, and safe rollouts (shadow/canary).

Documentation & Enablement (15%)

  • Maintain docs-as-code for pipelines, contracts, and runbooks; create internal guides and tech talks.
  • Mentor peers through design reviews, pair/mob sessions, and post-incident learning.
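
To make the data-contract idea concrete, here is a minimal sketch of the kind of check such a contract implies, written in plain pandas with hypothetical column names; in practice a framework named above (Great Expectations or Deequ) would own these expectations.

    import pandas as pd

    # Hypothetical contract for an ML-ready events table; the column names and
    # ranges are illustrative assumptions, not an actual College Board schema.
    CONTRACT = {
        "student_id": "int64",
        "event_ts": "datetime64[ns]",
        "score": "float64",
    }

    def validate_contract(df: pd.DataFrame) -> list[str]:
        """Return a list of human-readable contract violations (empty list = pass)."""
        violations = []
        for col, dtype in CONTRACT.items():
            if col not in df.columns:
                violations.append(f"missing column: {col}")
                continue
            if str(df[col].dtype) != dtype:
                violations.append(f"{col}: expected {dtype}, got {df[col].dtype}")
            if df[col].isna().any():
                violations.append(f"{col}: contains nulls")
        if "score" in df.columns and not df["score"].between(0, 1).all():
            violations.append("score: values outside [0, 1]")
        return violations

    if __name__ == "__main__":
        df = pd.DataFrame({
            "student_id": [1, 2, 3],
            "event_ts": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
            "score": [0.2, 0.9, 1.4],  # 1.4 violates the range check
        })
        print(validate_contract(df))  # ['score: values outside [0, 1]']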
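
The retrieval-quality metric named above, precision@k, is simple enough to sketch directly; the chunk IDs and relevance labels below are toy values for illustration only.

    def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
        """Fraction of the top-k retrieved chunks that are labeled relevant."""
        if k <= 0:
            return 0.0
        hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
        return hits / k

    # Toy example: the retriever returned five chunk IDs for one query,
    # and human labels mark two of them as relevant.
    retrieved = ["c7", "c2", "c9", "c4", "c1"]
    relevant = {"c2", "c4", "c8"}
    print(precision_at_k(retrieved, relevant, k=5))  # 0.4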

Requirements

  • 4+ years in data engineering (or 3+ with substantial ML productionization), with strong Python and distributed compute (Spark/Glue/Dask) skills.
  • Proven experience shipping ML data systems (training/eval datasets, feature or embedding pipelines, artifact/version management, experiment tracking).
  • MLOps/LLMOps: orchestration (Airflow/Step Functions), containerization (Docker), and deployment (SageMaker/EKS/ECS); CI/CD for data & models.
  • Expert SQL and data modeling for lakehouse/warehouse (Redshift/Athena/Iceberg), with performance tuning for large datasets.
  • Data quality & contracts (Great Expectations/Deequ), lineage/metadata (OpenLineage/DataHub/Amundsen), and drift/skew monitoring; a simple drift-check sketch follows this list.
  • Cloud experience, preferably with AWS services such as S3, Glue, Lambda, Athena, Bedrock, OpenSearch, API Gateway, DynamoDB, SageMaker, Step Functions, Redshift, and Kinesis.
  • Experience with BI tools such as Tableau, QuickSight, or Looker for real-time analytics and dashboards.
  • Security and privacy mindset; ability to design compliant pipelines handling sensitive student data.
  • The ability to judiciously evaluate the feasibility, fairness, and effectiveness of AI solutions, and to articulate the considerations and concerns around implementing models in the context of specific business applications.
  • Excellent communication, collaboration, and documentation habits.
  • A passion for mission-driven work and for expanding educational and career opportunities.
  • Authorization to work in the United States for any employer.
  • Curiosity and enthusiasm for emerging technologies, with a willingness to experiment with and adopt new AI-driven solutions and the comfort to learn and apply new digital tools independently and proactively.
  • Clear and concise communication skills, written and verbal.
  • A learner's mindset and a commitment to growth: welcoming diverse perspectives, giving and receiving timely, respectful feedback, and continuously improving through iterative learning and user input.
  • A drive for impact and excellence: solving complex problems, making data-informed decisions, prioritizing what matters most, and continuously improving through learning, user input, and external benchmarking.
  • A collaborative and empathetic approach: working across differences, fostering trust, and contributing to a culture of shared success.
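
As referenced in the list above, here is a minimal drift-check sketch using the population stability index over a single numeric feature; the thresholds, bin count, and synthetic samples are illustrative assumptions, not a College Board standard.

    import numpy as np

    def population_stability_index(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
        """PSI between a training-time feature sample and a serving-time sample.
        Common rule of thumb (assumption): <0.1 stable, 0.1-0.25 watch, >0.25 drift."""
        edges = np.histogram_bin_edges(expected, bins=bins)
        expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
        actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
        # Clip empty bins to avoid log(0) and division by zero.
        expected_pct = np.clip(expected_pct, 1e-6, None)
        actual_pct = np.clip(actual_pct, 1e-6, None)
        return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

    rng = np.random.default_rng(0)
    train_sample = rng.normal(0.0, 1.0, 10_000)   # feature distribution at training time
    serve_sample = rng.normal(0.5, 1.0, 10_000)   # shifted distribution at serving time
    print(round(population_stability_index(train_sample, serve_sample), 3))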

Nice To Haves

  • RAG & vector search experience (OpenSearch KNN/pgvector/FAISS) and prompt/eval frameworks; a minimal vector-search sketch follows this list.
  • Real‑time feature engineering (Kinesis/Kafka) and low‑latency stores for online inference.
  • Testing strategies for ML systems (unit/contract tests, data fuzzing, offline/online parity checks).
  • Experience in higher‑ed/assessments data domains.
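
As a vector-search illustration, here is a minimal FAISS sketch with a flat L2 index; the random vectors stand in for real chunk embeddings, and the dimension and index choice are assumptions for the example.

    import faiss                      # pip install faiss-cpu
    import numpy as np

    # Stand-in embeddings; in practice these would come from an embedding model.
    dim = 384
    rng = np.random.default_rng(0)
    doc_vectors = rng.random((1_000, dim), dtype=np.float32)   # chunk embeddings
    query_vector = rng.random((1, dim), dtype=np.float32)

    index = faiss.IndexFlatL2(dim)    # exact L2 search; swap for HNSW/IVF at scale
    index.add(doc_vectors)

    distances, ids = index.search(query_vector, 5)
    print(ids[0])        # indices of the 5 nearest chunks
    print(distances[0])  # their L2 distances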

Benefits

  • Annual bonuses and opportunities for merit-based raises and promotions
  • A mission-driven workplace where your impact matters
  • A team that invests in your development and success

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 1,001-5,000 employees
