Data Engineer, Security Products

OpenAI, San Francisco, CA

About The Position

Security Products is on a mission to transform cybersecurity by leveraging AI to give defenders a decisive advantage. Born out of OpenAI’s security research program and incubated with Sam Altman’s encouragement, the Security Products team is taking the breakthroughs we’ve developed internally and bringing them to the world. Our tools are already protecting OpenAI’s own systems, and we’re looking to scale their impact far beyond them.

We are a lean, high-caliber team operating at the frontier of AI and security. We blend security research with product engineering to explore what’s possible, build what’s practical, and give defenders an unprecedented advantage – all while advancing the science of AI-driven security itself.

As a Data Engineer on this team, you’ll help invent and build the next generation of AI-powered cybersecurity products. You’ll take the lead in building our data pipelines, core tables, and analytics to ensure our products are robust, reliable, and impactful. This is a fast-moving, high-ownership role: you’ll collaborate closely with security experts, AI researchers, and product engineers, shaping both the technical foundation and the product direction of a team that is scaling quickly. You don’t need to be a security expert to succeed here – just someone excited to solve meaningful problems and build tools that materially improve defenders’ capabilities.

Requirements

  • 3+ years of experience as a data engineer and 8+ years of overall software engineering experience (including data engineering).
  • Proficiency in at least one programming language commonly used in data engineering, such as Python, Scala, or Java.
  • Familiarity with public cloud environments (Azure or AWS) and infrastructure tooling (Kubernetes, Terraform, etc.).
  • Experience with distributed processing technologies and frameworks (e.g., Hadoop, Flink) and distributed storage systems (e.g., HDFS, S3).
  • Expertise with ETL schedulers such as Airflow, Dagster, Prefect, or similar frameworks.
  • Solid understanding of Spark and the ability to write, debug, and optimize Spark code (see the sketch after this list).
  • Ability to thrive in a fast-paced environment where problems may be ambiguous, emerging, or rapidly evolving.
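
To illustrate the kind of Spark work the role calls for, here is a minimal PySpark sketch that builds a daily aggregate from raw event data. The storage paths, schema, and column names (occurred_at, event_type, user_id) are hypothetical placeholders, not details from this posting.

```python
# Minimal PySpark sketch: aggregate daily event counts per event type.
# All paths and column names below are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_event_counts").getOrCreate()

# Read raw user event data from a (hypothetical) object-store location.
events = spark.read.parquet("s3://example-bucket/raw/user_events/")

# Derive a calendar date and roll events up by date and type.
daily_counts = (
    events
    .withColumn("event_date", F.to_date("occurred_at"))
    .groupBy("event_date", "event_type")
    .agg(
        F.countDistinct("user_id").alias("unique_users"),
        F.count("*").alias("event_count"),
    )
)

# Write a partitioned canonical table back to the warehouse layer.
daily_counts.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/canonical/daily_event_counts/"
)
```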

Responsibilities

  • Design, build, and manage our data pipelines, ensuring all user event data is seamlessly integrated into our data warehouse (a minimal scheduling sketch follows this list).
  • Develop canonical datasets to track key product insights including security outcomes, engagement and usage, and performance.
  • Work directly with internal and external customers to deeply understand their workflows and translate them into intuitive, powerful product experiences.
  • Work collaboratively with cross-functional partners, including Infrastructure, Data Science, Product, Marketing, Go-To-Market, and Research to solve problems and enable product success.
  • Implement robust and fault-tolerant systems for data ingestion and processing.
  • Participate in data architecture and engineering decisions, bringing your strong experience and knowledge to bear.
  • Ensure the security, integrity, and compliance of data according to industry and company standards.
  • Help shape the engineering culture, architecture, and processes of a new business unit with a critical mission.
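
As a rough illustration of the pipeline and ingestion work described above, here is a minimal Airflow DAG sketch. The DAG id, schedule, and task bodies are hypothetical stand-ins for whatever extraction and load logic the team actually runs; they are not taken from the posting.

```python
# Minimal Airflow sketch of a daily ingestion pipeline.
# DAG id, schedule, and task bodies are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_user_events(**context):
    # Placeholder: pull the previous day's user event data from the source system.
    print("extracting events for", context["ds"])


def load_to_warehouse(**context):
    # Placeholder: write validated events into the warehouse's canonical tables.
    print("loading events for", context["ds"])


with DAG(
    dag_id="user_events_ingestion",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(
        task_id="extract_user_events", python_callable=extract_user_events
    )
    load = PythonOperator(
        task_id="load_to_warehouse", python_callable=load_to_warehouse
    )

    # Load only after extraction succeeds.
    extract >> load
```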

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: No Education Listed
  • Number of Employees: 1,001-5,000 employees
