Lead Data Scientist

AppOmni

$210,000 - $240,000

About The Position

AppOmni is looking for a Lead Data Scientist to help define and build scalable, production-grade data pipelines and intelligent analytics capabilities within our SaaS platform. In this role, you will apply data science, statistical modeling, batch and real-time analytics, and large-scale data engineering to transform complex datasets into actionable product insights and customer-facing capabilities. You will work on pipelines across a broad range of technical domains, including ETL, statistical modeling, machine learning (supervised and unsupervised), and LLMs, as well as monitoring, governance, visualization, and production modeling systems.

We are looking for a highly versatile engineer-scientist: someone who has worked across different layers of the modern data stack and enjoys solving a wide variety of technical problems. This role is ideal for someone whose background spans data engineering, infrastructure, analytics applications, statistical modeling, and operational production systems. You will be responsible for end-to-end data workflows, from ingestion and transformation through analytics implementation, orchestration, monitoring, governance, and production operations. This is a hands-on individual contributor role with technical leadership responsibilities, partnering closely with Product and Engineering to build reliable, scalable, and intelligent data-driven systems.

Requirements

7–10+ years of experience as a Data Scientist, Applied Scientist, Data Engineer, or Machine Learning Engineer, with ownership of production systems.
Strong experience building and operating large-scale data pipelines and distributed data processing systems.
Hands-on experience within the GCP ecosystem, particularly big data services such as Dataproc, Dataflow, Pub/Sub, and related storage and data lake technologies.
Strong proficiency in Python, PySpark, and modern data processing frameworks.
Experience working across multiple disciplines of the data stack, including data engineering, analytics, infrastructure, monitoring/governance, APIs, and visualization.
Experience with real-time or streaming systems such as Apache Beam/Dataflow and orchestration frameworks such as Airflow.
Strong foundation in statistical modeling, analytics, and applied data science techniques.
Experience designing and maintaining scalable ETL workflows and production data infrastructure.
Familiarity with monitoring, observability, governance, and reliability practices for production data systems.
Ability to thrive in highly cross-functional environments and contribute across a wide range of technical challenges.
Demonstrated versatility: a background that spans multiple types of data applications, infrastructure, and analytics work is highly valued.
Experience partnering closely with Product and Engineering to deliver customer-facing capabilities.
Strong written and verbal communication skills.

Responsibilities

Design and implement scalable batch and real-time data processing systems across large and complex datasets.
Build and optimize ETL and streaming data pipelines using modern GCP big data technologies.
Lead development decisions around model choices, data architecture, data modeling, pipeline orchestration, analytics infrastructure, and production systems.
Develop statistical models and analytics capabilities that support product intelligence and operational insights.
Design and maintain production-grade data workflows using technologies such as Airflow, Dataflow, Pub/Sub, and PySpark.
Contribute across multiple areas of the data ecosystem, including data engineering, monitoring and governance, visualization, and analytics tooling.
Establish monitoring, observability, and governance practices for data quality, pipeline reliability, and production health.
Partner closely with Engineering to operationalize scalable data infrastructure and analytics systems.
Collaborate with Product to shape intelligent, data-driven product capabilities and user experiences.
Act as a technical leader and thought partner across data engineering, analytics, infrastructure, and applied modeling initiatives.
Help evolve internal tooling and frameworks that improve scalability, reliability, and operational efficiency across the platform.

Benefits

Generous paid time off
Paid company holidays
Paid floating holidays
Paid parental leave
Paid sick time
Paid family leave for applicable states
Health insurance: medical, dental, and vision, with HSA option
LifeWorks Employee Assistance Program
Company-provided life insurance, AD&D, STD/LTD, and additional supplemental life insurance options
401(k) and Roth retirement savings accounts
Monthly wellness benefit reimbursement
Stock options