This job is closed
About the position
We are seeking a Data Engineer to join our team at OpenAI. In this role, you will be responsible for designing, building, and managing our data pipelines, as well as developing canonical datasets to track key product metrics. You will collaborate with various teams to understand their data needs and provide solutions, and ensure the security, integrity, and compliance of data. The ideal candidate should have experience in data engineering, proficiency in programming languages such as Python or Java, and familiarity with distributed processing technologies and frameworks. This position is based in our San Francisco headquarters with relocation assistance available.
Responsibilities
- Design, build, and manage data pipelines to integrate user event data into the data warehouse.
- Develop canonical datasets to track key product metrics such as user growth, engagement, and revenue.
- Collaborate with various teams to understand their data needs and provide solutions.
- Implement robust and fault-tolerant systems for data ingestion and processing.
- Participate in data architecture and engineering decisions.
- Ensure the security, integrity, and compliance of data according to industry and company standards.
Requirements
- 3+ years of experience as a data engineer and 8+ years of software engineering experience overall (data engineering included)
- Proficiency in at least one programming language commonly used within Data Engineering, such as Python, Scala, or Java
- Experience with distributed processing frameworks such as Hadoop and Flink, and with distributed storage systems (e.g., HDFS, S3)
- Expertise with an ETL scheduler such as Airflow, Dagster, Prefect, or a similar framework
- Solid understanding of Spark and ability to write, debug, and optimize Spark code
- Ability to design, build, and manage data pipelines
- Experience in developing canonical datasets to track key product metrics
- Collaborative mindset and ability to work with various teams
- Knowledge of data architecture and engineering decisions
- Strong focus on data security, integrity, and compliance according to industry and company standards
- Willingness to work exclusively from the San Francisco HQ (relocation assistance available)
Benefits
- Medical, dental, and vision insurance for you and your family
- Mental health and wellness support
- 401(k) plan with 4% matching
- Unlimited time off and 18+ company holidays per year
- Paid parental leave (20 weeks) and family-planning support
- Annual learning & development stipend ($1,500 per year)
Dev & Engineering