About the position
Patreon is seeking a Data Engineer to support its mission of building a content and community platform where creators can engage directly with their fans and monetize their creativity. The ideal candidate will join a tight-knit team of highly motivated and experienced data engineers, building core data sets and metrics to power analytics, reports, and experimentation. They will also write real-time and batch data pipelines to support a wide range of projects and features, and help manage and build out Patreon's data platform and suite of data tools. The role is available on a hybrid model from the SF or NY offices, or fully remote within the United States.
Responsibilities
- Work on a tight-knit team of highly motivated and experienced data engineers with frequent collaboration with data scientists, product managers and product engineers.
- Work on both data-analytics and data-infrastructure projects in a fast-paced, high-growth startup environment.
- Build core data sets and metrics to power analytics, reports and experimentation.
- Write real-time and batch data pipelines to support a wide range of projects and features including our creator-facing analytics product, executive reporting, FP&A, marketing initiatives, model training, data science analytics, A/B testing, etc.
- Help manage and build out our data platform and suite of data tools.
- Be a driver of a data-centric culture at Patreon. Work autonomously on large greenfield initiatives and help define data best practices at the company.
Requirements
- Expert in SQL, Spark, and either Python or Scala
- Significant experience modeling data and developing core data sets and metrics to support analytics, reports and experimentation. Solid understanding of how to use data to inform the product roadmap.
- Enjoy collaborating with Data Scientists, Product Managers, and Product Engineers. Comfortable playing the role of a Project Manager to drive results.
- Have previously built real-time and batch data pipelines using tooling such as Airflow, Spark, Kafka, S3, Fivetran, Census, etc. Experience with event tracking frameworks, data observability frameworks, and experimentation frameworks.
- Experience managing and working with data warehouses and data lakes such as Redshift, BigQuery, Snowflake, Delta Lake, etc.
- Highly motivated self-starter who is keen to make an impact, unafraid of tackling large, complicated problems, and willing to put in the work to ensure high-craft deliverables.