Machine Learning Engineer, Specialist

The Vanguard Group
Malvern, PA
Hybrid

About The Position

Supports and performs the development and programming of machine learning integrated software algorithms to structure, analyze, and leverage data in a production environment.

Requirements

  • Undergraduate degree or equivalent experience; a graduate degree is preferred.
  • Minimum of 5 years of relevant work experience.
  • At least 3 years of hands-on experience designing ETL pipelines using AWS services (e.g., Glue, SageMaker).
  • Proficiency in programming languages, particularly Python (including PySpark, PySQL) and familiarity with machine learning libraries and frameworks.
  • Strong understanding of cloud technologies, including AWS and Azure, and experience with NoSQL databases.
  • Familiarity with Feature Store usage, LLMs, GenAI, RAG, Prompt Engineering, and Model Evaluation.
  • Solid understanding of software engineering principles, including design patterns, testing, security, and version control.
  • Knowledge of Machine Learning Development Lifecycle (MDLC) best practices and protocols.
  • Understanding of solution architecture for building end-to-end machine learning data pipelines.

Nice To Haves

  • Experience with API design and development is a plus.

Responsibilities

  • Leverages data pipeline designs and supports the development of data pipelines to support model development.
  • Proficient with software tools for developing data pipelines in a distributed computing environment (PySpark, AWS Glue ETL).
  • Supports integration of model pipelines in a production environment.
  • Develops understanding of SDLC for model production.
  • Reviews pipeline designs and makes data model design changes as needed.
  • Documents and reviews design changes with data science teams.
  • Supports data discovery & automated ingestion for model development.
  • Performs detailed analysis of raw data sources for data quality, applying business context and model development needs.
  • Engages with internal stakeholders to understand and probe business processes in order to develop hypotheses.
  • Brings structure to requests and translates requirements into an analytic approach.
  • Participates in and influences ongoing business planning and departmental prioritization activities.
  • Runs model monitoring scripts and follows the process for alerting management as needed.
  • Addresses issues found in data pipelines from model monitoring alerts.
  • Participates in special projects and performs other duties as assigned.

Benefits

  • Vanguard is not offering visa sponsorship for this position.