About the position
Patreon is seeking a Data Engineer to join its team in either the SF or NY office, or fully remote in the United States. The successful candidate will work on a tight-knit team of highly motivated and experienced data engineers, building core data sets and metrics to power analytics, reports and experimentation. They will also write real-time and batch data pipelines to support a wide range of projects and features, help manage and build out Patreon's data platform and suite of data tools, and be a driver of a data-centric culture at Patreon. The ideal candidate is an expert in SQL, Spark, and Python or Scala, with significant experience modeling data and developing core data sets and metrics to support analytics, reports and experimentation.
Responsibilities
- Work on a tight-knit team of highly motivated and experienced data engineers, collaborating frequently with data scientists, product managers and product engineers.
- Work on both “data analytics” and “data infrastructure” type projects in a fast-paced, high-growth startup environment.
- Build core data sets and metrics to power analytics, reports and experimentation.
- Write real-time and batch data pipelines to support a wide range of projects and features including our creator-facing analytics product, executive reporting, FP&A, marketing initiatives, model training, data science analytics, A/B testing, etc.
- Help manage and build out our data platform and suite of data tools.
- Be a driver of a data-centric culture at Patreon. Work autonomously on large greenfield initiatives and help define data best practices at the company.
Requirements
- Expert in SQL, Spark and Python or Scala
- Significant experience modeling data and developing core data sets and metrics to support analytics, reports and experimentation. Solid understanding of how to use data to inform the product roadmap.
- Enjoy collaborating with Data Scientists, Product Managers and Product Engineers. Comfortable playing the role of a Project Manager to drive results.
- Have previously built real-time and batch data pipelines using tooling such as Airflow, Spark, Kafka, S3, Fivetran, Census, etc. Experience working with event tracking frameworks, data observability frameworks and experimentation frameworks.
- Experience managing and working with Data Warehouses and Data Lakes such as Redshift, BigQuery, Snowflake, Delta Lake, etc.
- Highly motivated self-starter who is keen to make an impact, unafraid of tackling large, complicated problems, and willing to put in the work to ensure high-craft deliverables.