AWS DevOps Engineer (With deep expertise in MLOps)

Toyota North America, Plano, TX
Posted 2 days ago

About The Position

Overview

Who We Are

Collaborative. Respectful. A place to dream and do. These are just a few words that describe what life is like at Toyota. As one of the world’s most admired brands, Toyota is growing and leading the future of mobility through innovative, high-quality solutions designed to enhance lives and delight those we serve. We’re looking for talented team members who want to Dream. Do. Grow. with us.

An important part of the Toyota family is Toyota Financial Services (TFS), the finance and insurance brand for Toyota and Lexus in North America. While TFS is a separate business entity, it is an essential part of this world-changing company, delivering on Toyota's vision to move people beyond what's possible. At TFS, you will help create a best-in-class customer experience in an innovative, collaborative environment.

To save time applying, Toyota does not offer sponsorship of job applicants for employment-based visas or any other work authorization for this position at this time.

Who We’re Looking For

Toyota Financial Services is looking for a seasoned AWS DevOps Engineer with deep expertise in MLOps to design, deploy, and support a scalable SageMaker platform that accelerates the end-to-end machine learning lifecycle. This role will empower data scientists by enabling seamless model development, versioning, deployment, and monitoring in production.

The ideal candidate will manage AWS infrastructure, enforce enterprise security, and integrate Single Sign-On (SSO) with Okta, while supporting a broad range of AWS services and automating infrastructure as code.

Requirements

  • 5+ years of hands-on experience in AWS cloud infrastructure and DevOps engineering with a strong focus on MLOps.
  • Expertise in AWS services: SageMaker (including SageMaker Pipelines and Model Registry), ECS Fargate, EC2, ALB, S3, DynamoDB, OpenSearch, and Amazon Bedrock.
  • Proven experience in productionizing ML models, managing model versioning, lineage, and lifecycle.
  • Strong skills in Infrastructure as Code using Terraform and OpenTofu.
  • Experience designing and implementing CI/CD pipelines for ML workflows and Python microservices.
  • Deep understanding of AWS security best practices, IAM policies, Guardrails, and enterprise security frameworks.
  • Experience integrating Single Sign-On (SSO) solutions using Okta and a MuleSoft proxy.
  • Familiarity with ML lifecycle management tools and frameworks.
  • Strong scripting and automation skills.
  • Excellent problem-solving, collaboration, and communication skills.

Nice To Haves

  • AWS certifications (e.g., AWS Certified DevOps Engineer - Professional, AWS Certified Machine Learning - Specialty).
  • Experience with container orchestration, microservices, and serverless architectures.
  • Knowledge of monitoring, logging, and alerting tools for ML models and cloud infrastructure.
  • Familiarity with open-source MLOps tools (e.g., MLflow, Kubeflow).

Responsibilities

  • Design, deploy, and maintain a robust AWS SageMaker platform to support the full ML lifecycle: data preparation, model training, tuning, deployment, monitoring, and retraining.
  • Collaborate closely with data scientists to productionize machine learning models, ensuring scalability, reliability, and performance.
  • Implement model versioning, lineage tracking, and governance to support reproducibility and auditability.
  • Build and maintain MLOps pipelines that automate continuous integration, continuous delivery (CI/CD), and continuous training (CT) of ML models.
  • Manage AWS infrastructure, including EC2, ECS Fargate, ALB, S3, DynamoDB, OpenSearch, and Amazon Bedrock, to support AI/ML workloads.
  • Enforce enterprise security best practices using IAM, Guardrails, and AWS security services.
  • Configure and manage Single Sign-On (SSO) integration with Okta and a MuleSoft proxy for secure platform access.
  • Automate infrastructure provisioning and management using Infrastructure as Code (IaC) tools such as Terraform and OpenTofu.
  • Monitor deployed models and infrastructure for performance, drift, and anomalies; implement alerting and remediation workflows.
  • Support a containerized microservices architecture with Python-based services, establishing CI/CD pipelines for rapid deployment.
  • Stay current with AWS services, MLOps frameworks, AI ecosystem trends, and DevOps best practices.

Benefits

  • A work environment built on teamwork, flexibility, and respect
  • Professional growth and development programs to help advance your career, as well as tuition reimbursement
  • Team Member Vehicle Purchase Discount
  • Toyota Team Member Lease Vehicle Program (if applicable)
  • Comprehensive health care and wellness plans for your entire family
  • Toyota 401(k) Savings Plan featuring a company match, as well as an annual retirement contribution from Toyota regardless of whether you contribute
  • Paid holidays and paid time off

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000 employees
