About the position
Spotter supports YouTube creators with the resources they need to grow their businesses. It offers creator-friendly growth capital, allowing creators to reinvest in their content while retaining control over their catalogs and future earnings. Beyond funding, Spotter provides creators with data insights to optimize their content performance. The company has already invested significant capital in YouTube creators and holds a large licensed content library. It is currently seeking a Data Engineer with experience in scalable data pipelines, big data technologies, and cloud services to join its team in Los Angeles.
Responsibilities
- Develop and maintain scalable data pipelines, including ETL jobs, data quality assurance steps, and the creation of derived and analytics-ready datasets.
- Troubleshoot issues with data pipelines and work directly with internal data consumers.
- Automate pipeline runs with scheduling and orchestration tools.
- Work with large-scale datasets and use various external APIs to enrich data.
- Set up database tables for analytics users to consume the collected data.
- Work with big data technologies to improve data availability and quality in the cloud.
- Mentor more junior members of the team.
Requirements
- Bachelor's degree, preferably in Computer Science or Computer Information Systems
- 5+ years of software engineering experience
- 3+ years of data engineering experience with Apache Spark or Apache Flink
- 3+ years of experience running software and services in the cloud
- Proficiency in working with DataFrame APIs (Pandas and Spark) for parallel and single-node processing
- Proficiency with advanced languages and techniques (Python, Scala, etc.) and modern data-optimized file formats such as Parquet and Avro
- Proficiency with SQL on RDBMS and data warehouse solutions like Redshift
- Experience with YouTube APIs
- Experience with large-scale, parallel data acquisition from external APIs
- Experience with Data-Lake technologies
- Experience with AWS Glue metastore
- Experience with Data-Mesh approaches
- Experience with data cataloging, data lineage, and data governance tools and approaches
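To give candidates a concrete sense of the DataFrame and file-format work listed above, here is a minimal sketch in Pandas. The dataset and column names are hypothetical, invented purely for illustration; day-to-day work at this scale would more likely use Spark DataFrames, with results persisted in a columnar format such as Parquet.

```python
import pandas as pd

# Hypothetical raw records, standing in for data pulled from an external API.
raw = pd.DataFrame({
    "channel_id": ["a1", "a1", "b2"],
    "views": [1200, 800, 500],
})

# Basic data quality step: drop rows with missing or negative view counts.
clean = raw.dropna()
clean = clean[clean["views"] >= 0]

# Derive an analytics-ready summary table: total views per channel.
summary = clean.groupby("channel_id", as_index=False)["views"].sum()

# In a real pipeline this would typically be written out as Parquet, e.g.:
#   summary.to_parquet("views_summary.parquet", index=False)
print(summary)
```

The same groupby-and-aggregate pattern translates almost directly to the Spark DataFrame API for parallel processing across a cluster.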
Benefits
- Medical insurance covered up to 100%
- Dental & vision insurance
- 401(k) matching
- Stock options
- Autonomy and upward mobility
- Diverse, equitable, and inclusive culture where your voice matters