Hitachi • Posted 4 months ago
Full-time • Senior
Dallas, TX
Electrical Equipment, Appliance, and Component Manufacturing

We're Hitachi Digital Services, a global digital solutions and transformation business with a bold vision of our world's potential. We're people-centric and here to power good. Every day, we future-proof urban spaces, conserve natural resources, protect rainforests, and save lives. This is a world where innovation, technology, and deep expertise come together to take our company and customers from what's now to what's next. We make it happen through the power of acceleration. Imagine the sheer breadth of talent it takes to bring a better tomorrow closer to today. We don't expect you to 'fit' every requirement - your life experience, character, perspective, and passion for achieving great things in the world are equally important to us.

As part of the Cloud & Product Engineering Services practice, we are looking for a talented ML Ops Tech Lead to join us, contribute to practice initiatives, and keep current with the latest technology developments in the Cloud and Product Engineering space.

Responsibilities:
  • Build & Automate ML Pipelines: Design, implement, and maintain CI/CD pipelines for machine learning models, ensuring automated data ingestion, model training, testing, versioning, and deployment.
  • Operationalize Models: Collaborate closely with data scientists to containerize, optimize, and deploy their models to production, focusing on reproducibility, scalability, and performance.
  • Infrastructure Management: Design and manage the underlying cloud infrastructure (AWS) that powers our MLOps platform, leveraging Infrastructure-as-Code (IaC) tools to ensure consistency and cost optimization.
  • Monitoring & Observability: Implement comprehensive monitoring, alerting, and logging solutions to track model performance, data integrity, and pipeline health in real time. Proactively address issues such as model drift and data drift.
  • Governance & Security: Establish and enforce best practices for model and data versioning, auditability, security, and access control across the entire machine learning lifecycle.
  • Tooling & Frameworks: Develop and maintain reusable tools and frameworks to accelerate the ML development process and empower data science teams.

Qualifications:
  • 10+ years of overall experience, including 4+ years in MLOps, Machine Learning Engineering, or a related DevOps role with a focus on ML workflows.
  • Extensive hands-on experience in designing and implementing MLOps solutions on AWS. Proficient with core services like SageMaker, S3, ECS, EKS, Lambda, SQS, SNS, and IAM.
  • Strong coding proficiency in Python. Extensive experience with automation tools, including Terraform for IaC and GitHub Actions.
  • A solid understanding of MLOps and DevOps principles. Hands-on experience with MLOps frameworks like SageMaker Pipelines, Model Registry, Weights & Biases, MLflow, or Kubeflow, and orchestration tools like Airflow or Argo Workflows.
  • Expertise in developing and deploying containerized applications using Docker and orchestrating them with ECS and EKS.
  • Experience with model testing, validation, and performance monitoring. A good understanding of ML frameworks like PyTorch or TensorFlow is required to collaborate effectively with data scientists.
  • Excellent communication and documentation skills, with a proven ability to collaborate with cross-functional teams (data scientists, data engineers, and architects).

Benefits:
  • Industry-leading benefits, support, and services that look after your holistic health and wellbeing.
  • Flexible arrangements that work for you (role and location dependent).
  • A sense of belonging, autonomy, freedom, and ownership as you work alongside talented people.