Software Engineer, Data

HeyGen
Los Angeles, CA
$180,000 - $220,000

About The Position

At HeyGen, our mission is to make visual storytelling accessible to all. Over the last decade, visual content has become the preferred method of information creation, consumption, and retention. But the ability to create such content, in particular videos, continues to be costly and challenging to scale. Our ambition is to build technology that equips more people with the power to reach, captivate, and inspire audiences. Learn more at www.heygen.com and in our Mission and Culture doc.

Position Summary

We are looking for a Software Engineer with data engineering responsibilities to bridge the gap between core application development and large-scale data infrastructure. You will help build the foundational data layers for our next-generation features. This role is not just about moving data; it is about enabling AI models to function in real time, building robust pipelines for multimedia, and powering engaging user experiences. The team is currently working on cutting-edge features, including PPT-to-video converters and interactive, conversational video capabilities.

Requirements

  • Bachelor’s/Master’s degree in Computer Science, Engineering, or a related field.
  • 3-5+ years of experience as a Backend Software Engineer with heavy data processing responsibilities.
  • Strong proficiency in Python (for ETL/scripting) and SQL (for data modeling).
  • Experience with cloud platforms (AWS/GCP) and data technologies like Kafka, Spark, and Snowflake/Databricks.
  • Proactive, "owner" mindset; ability to operate in a fast-paced, startup environment.

Nice To Haves

  • Experience or interest in Computer Vision/Generative AI data processing.

Responsibilities

  • Build & Scale Data Pipelines: Design, develop, and maintain robust batch and real-time data pipelines (using Python, Go, Spark, Kafka) that ingest and transform massive multi-modal data—text, audio, and video—to train and run AI models.
  • Power Intelligent Features: Collaborate with ML engineers to implement data structures and APIs for new, exciting features like PPT-to-video automation and interactive AI avatars that require low-latency data fetching.
  • Data Lakehouse Infrastructure: Architect and manage data lakehouse solutions (e.g., Snowflake, Databricks, Apache Iceberg) to store and query unstructured media data with efficient use of storage and compute.
  • Data Reliability & Observability: Implement data quality checks, data contracts, and monitoring to ensure high reliability of data, preventing downtime in production video generation.
  • Productize Data: Transform raw data into structured, actionable data products that can be easily consumed by front-end applications, API endpoints, and AI agents.

Benefits

  • Competitive salary and benefits package.
  • Dynamic and inclusive work environment focused on innovation and creativity.
  • Opportunities for professional growth and skill development.
  • Collaborative culture that values teamwork and employee input.
  • Access to state-of-the-art technologies and tools.