Sr. ML Engineer

Blend360
Columbia, MD
Posted 119 days ago

About The Position

Blend is seeking an experienced Machine Learning Engineer with deep expertise in AWS-based ML pipelines, MLOps best practices, and infrastructure-as-code. This role is focused entirely on pipeline engineering and infrastructure optimization (no model training or research) and will play a critical part in refactoring mature ML systems to support upcoming business initiatives. The engineer will work closely with cross-functional data science, data engineering, and platform engineering teams to refactor, migrate, and scale the production-grade ML pipelines that power recommender systems and, secondarily, NLP applications. The ideal candidate will be comfortable with large-scale AWS-native environments, feature store integrations, and high-performance CI/CD workflows for ML.

Requirements

  • 5+ years of hands-on ML engineering experience (7+ preferred).
  • Proven success in AWS-based ML pipeline engineering at scale.
  • Core Technical Skills: AWS SageMaker, CloudFormation, Lambda, ECR, S3, DynamoDB (a minimal Lambda-to-SageMaker orchestration sketch follows this list).
  • Python for pipeline development, automation, and integration.
  • Snowflake data integration and optimization.
  • CI/CD in AWS CodePipeline for ML workflows.
  • MLOps best practices for production-grade pipelines.
  • Strong understanding of batch and streaming ML pipeline architectures.
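
To ground the AWS-native skills above, here is a minimal, illustrative sketch of the kind of glue code this role involves: a Lambda handler that starts a registered SageMaker pipeline via boto3 when a new dataset lands in S3. The pipeline name, parameter name, and environment variable below are hypothetical, not part of Blend's actual stack.

    import json
    import os

    import boto3

    # One shared client per Lambda container, re-used across invocations.
    sagemaker = boto3.client("sagemaker")

    # Hypothetical pipeline name, e.g. injected via CloudFormation.
    PIPELINE_NAME = os.environ.get("PIPELINE_NAME", "recsys-feature-refresh")


    def handler(event, context):
        # Pull the S3 object that triggered this invocation (standard S3 event shape).
        record = event["Records"][0]["s3"]
        input_uri = f"s3://{record['bucket']['name']}/{record['object']['key']}"

        # Kick off the registered SageMaker pipeline, pointing it at the new input path.
        response = sagemaker.start_pipeline_execution(
            PipelineName=PIPELINE_NAME,
            PipelineParameters=[{"Name": "InputDataUri", "Value": input_uri}],
        )
        return {
            "statusCode": 200,
            "body": json.dumps({"executionArn": response["PipelineExecutionArn"]}),
        }

In a CodePipeline-driven setup, the same start_pipeline_execution call is often issued from a build or deploy stage rather than a Lambda; the sketch is only meant to show the shape of the work.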

Nice To Haves

  • Experience with recommender systems (primary use case).
  • Familiarity with NLP applications (secondary focus).
  • Ability to work independently on complex refactoring and migration projects.
  • Excellent collaboration skills with cross-functional teams.
  • Strong problem-solving and documentation capabilities.

Responsibilities

  • Redesign and refactor existing ML pipelines to improve scalability, maintainability, and operational efficiency.
  • Migrate pipelines to accommodate new input datasets that will drive updated models.
  • Ensure pipelines can handle both batch and streaming workloads.
  • Work with Tecton to manage and serve online/offline features for ML models (see the feature retrieval sketch after this list).
  • Migrate legacy feature ingestion and retrieval processes to Tecton.
  • Develop and manage infrastructure using AWS CloudFormation and other IaC tools.
  • Leverage AWS services such as SageMaker, Lambda, ECR, S3, and DynamoDB for ML workflows.
  • Implement and maintain deployment pipelines using AWS CodePipeline.
  • Ensure seamless integration of ML workflows with SageMaker for training, inference, and monitoring.
  • Apply robust testing strategies, code coverage, and quality controls to all ML pipeline code.
  • Integrate ML pipelines with Snowflake, S3, and DynamoDB data sources.
  • Optimize data ingestion, transformation, and delivery to production models.
  • Identify and remediate technical debt in ML infrastructure.
  • Support migration of existing code to align with new feature store and data input requirements.
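
For the Tecton-related responsibilities above, the sketch below illustrates online feature retrieval for a recommender model, assuming the Tecton Python SDK's workspace/feature-service interface; the workspace, feature service, and join key names are hypothetical.

    import tecton


    def fetch_user_features(user_id: str) -> dict:
        # Look up the production workspace and the feature service backing
        # the recommender's online inference path (names are hypothetical).
        workspace = tecton.get_workspace("prod")
        feature_service = workspace.get_feature_service("recsys_ranking_features")

        # Retrieve the latest online feature values for this user.
        feature_vector = feature_service.get_online_features(
            join_keys={"user_id": user_id}
        )
        return feature_vector.to_dict()


    if __name__ == "__main__":
        print(fetch_user_features("user-123"))

In latency-sensitive serving paths, the equivalent lookup is typically made against Tecton's HTTP feature server rather than through the Python SDK; the sketch just shows how online/offline feature definitions surface to model code.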