Machine Learning Platform - Lead Engineer

Allstate • McCullom Lake, IL

4d • $110,000 - $160,000

About The Position

Allstate's Data & Analytics Technology organization is seeking a Machine Learning Platform Lead Engineer to architect, build, and scale the core platforms that power enterprise-wide machine learning solutions. In this role, you will provide deep technical leadership across ML infrastructure, MLOps automation, model deployment systems, and cloud-native engineering. You will influence platform strategy, guide architectural decisions, and collaborate closely with data science, engineering, security, and product teams to enable reliable, scalable, and responsible ML adoption across the enterprise. This role is ideal for a senior technologist who thrives on hands-on engineering, technical leadership, and building high-impact ML platform capabilities.

Requirements

Extensive experience in ML engineering, platform engineering, or large-scale distributed systems.
Deep hands-on expertise with MLOps tools, ML frameworks, model deployment techniques, and ML lifecycle automation.
Strong proficiency in Python and backend development for machine learning systems.
Experience with cloud platforms and ML services, including Azure ML Studio, AWS SageMaker, and/or Google Vertex AI.
Exposure to cloud storage and data services such as Microsoft Fabric/OneLake, AWS S3, and Google Cloud Storage (GCS).
Experience with cloud-native scanning and security tools such as Azure Defender, Microsoft Purview, AWS Security Hub, Amazon Inspector, GCP Security Command Center, or equivalent services.
Strong understanding of Kubernetes, Docker, CI/CD, and infrastructure-as-code tools such as Terraform.
Solid knowledge of system design, APIs, data pipelines, and scalable ML infrastructure patterns.
Proven ability to lead technical initiatives and influence cross-team engineering decisions.
6+ years of related experience (preferred).

Responsibilities

Serve as the technical lead for ML platform architecture, guiding system design, scalability, performance, and reliability across platform components.
Architect and build core ML platform services, including training and compute infrastructure, feature stores, model registries, inference runtimes, and data pipelines.
Drive architectural decisions for distributed systems, cloud-native frameworks, and automated MLOps workflows that support enterprise-scale machine learning.
Evaluate and integrate emerging ML platform technologies, tools, and best practices to continuously strengthen platform capabilities.
Design and implement robust MLOps pipelines for experiment tracking, data and model versioning, CI/CD for ML, automated retraining, and model governance.
Develop automated workflows that ensure reproducible model training, validation, deployment, and lifecycle management across multiple environments.
Implement monitoring and observability systems for model performance, data quality, drift detection, and inference reliability.
Build and optimize cloud-based ML infrastructure on Azure, AWS, or GCP using Kubernetes, containerization, and infrastructure-as-code.
Develop scalable batch and streaming data pipelines using modern data engineering tools and frameworks.
Embed security, compliance, responsible AI principles, and cost optimization best practices within ML platform architecture and operations.
Collaborate with data scientists to translate modeling needs into scalable, reusable, and self-service platform capabilities.
Work closely with security, compliance, and governance teams to ensure safe and compliant deployment of AI/ML solutions.
Partner with application engineering teams to accelerate adoption of ML services and enable consistent, high-quality production deployments.
Provide technical mentorship, set engineering standards, and contribute to documentation, best practices, and ongoing platform improvements.