Luma AI, Inc. • Posted about 1 month ago
Full-time • Mid Level
Palo Alto, CA
51-100 employees
Publishing Industries

As a Data Infrastructure Engineer at Luma, you will play a critical role in building and scaling the data infrastructure that supports our cutting-edge multimodal AI systems. Your work will focus on developing high-throughput, large-scale data processing pipelines tailored for machine learning research and internal ML platform needs. You will collaborate closely with ML researchers and product teams to create reliable, efficient, and easy-to-use data infrastructure that empowers innovation and accelerates development. This role requires a strong foundation in distributed systems and data engineering, with an emphasis on supporting complex machine learning workflows rather than traditional product data infrastructure.

  • Build and maintain scalable data infrastructure for high-throughput machine learning workflows
  • Collaborate with ML researchers and product teams to ensure data systems meet evolving needs
  • Develop and optimize large-scale data pipelines and batch processing jobs
  • Contribute to the architecture and implementation of reliable, high-performance data platforms
  • Integrate open-source tools into the data infrastructure
  • Participate in cross-functional projects to improve data reliability, scalability, and operational excellence
  • Support the evaluation and adoption of new programming languages and frameworks relevant to data infrastructure
  • Engage in continuous improvement of data infrastructure through monitoring, troubleshooting, and performance tuning
  • Collaborate with research & engineering teams to help define and refine best practices for data infrastructure development
  • Proficiency in Python (or a similar language, with a willingness to learn Python) and experience with large-scale, high-throughput data infrastructure
  • Familiarity with distributed computing frameworks (e.g., Ray, Spark, Beam)
  • Ability to design and optimize data pipelines for ML research and internal teams
  • Strong problem-solving skills and understanding of data engineering at scale
  • Collaborative, product-focused mindset
  • Experience sourcing, integrating, and optimizing data from diverse and large datasets
  • Comfortable working in a fast-paced, product-focused environment with a strong execution mindset
  • Open to candidates across seniority levels, from mid-level individual contributors to senior engineers and managers
  • Prior experience working with complex data infrastructure or AI/ML platforms is highly desirable
  • Experience with open-source data infrastructure projects is a plus