Staff Software Engineer - Data Platform

ID.me | Mountain View, CA
Onsite

About The Position

ID.me is seeking a Staff Data Engineer (SDE-V) to lead the design, build, and operation of the core data infrastructure that underpins our identity platform. This engineer will be responsible for ensuring the reliability, scalability, and performance of the systems that move, process, and store data across the company.

In this role, you’ll own and operate key data infrastructure components, including event streaming platforms, relational databases, and batch processing systems, while driving automation and engineering best practices that improve data platform reliability and developer efficiency. You’ll partner closely with Platform Engineering, Site Reliability Engineering, and Compliance teams to ensure ID.me’s data ecosystem meets demanding operational, security, and regulatory requirements.

This is a hands-on technical leadership role for a data infrastructure engineer who thrives at the intersection of distributed systems, platform engineering, and data operations. This role is based out of our Mountain View, CA or McLean, VA offices and requires full-time in-office attendance.

Requirements

  • Bachelor’s or Graduate degree in Computer Science, Software Engineering, or a related technical field.
  • 8+ years of professional experience in data engineering, software engineering, or distributed systems development.
  • 6+ years of programming experience in one or more languages such as Go, Python, or Java, with emphasis on automation and data system integration.

Nice To Haves

  • Deep expertise in building and operating data systems—including relational databases, streaming, and batch platforms—in production environments.
  • Hands-on experience administering and optimizing PostgreSQL or other relational databases in the cloud (AWS RDS, CloudSQL, or AlloyDB); a query-inspection sketch follows this list.
  • Solid understanding of reliability engineering principles, including observability, SLOs, capacity management, and operational readiness.
  • Experience managing cloud infrastructure (AWS or GCP) using infrastructure-as-code and deployment tooling such as Terraform, Kubernetes, or Helm.
  • Experience operating event streaming platforms such as Kafka or Google Pub/Sub.
  • Experience with batch and stream processing systems such as Dataflow, Temporal, or Airflow.
  • Strong knowledge of data pipeline orchestration, change data capture, and schema management.
  • Background in automation, incident response, and data platform observability.
  • Familiarity with data governance and regulatory compliance frameworks (e.g., FedRAMP, GDPR, NIST).
  • Contributions to open-source data infrastructure projects or strong community engagement in the data reliability space.
  • Passion for performance engineering, system design, and mentoring others to deliver operational excellence at scale.
  • AI-assisted development: demonstrable experience using AI developer tools (e.g., code generation, test generation, query synthesis) to accelerate platform automation while validating outputs through code review and tests.
  • Data-aware LLM usage: ability to safely use large language models for tasks such as SQL generation, data lineage summarization, and runbook drafting, ensuring no sensitive data is exposed to external models and that all prompts and outputs are logged for audit; a guardrail sketch follows this list.
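
The guardrail described in the last bullet can be sketched in a few lines of Python. Everything here is hypothetical: the redaction patterns are deliberately naive, and call_model stands in for whatever approved model endpoint a team would actually use.

    import json
    import re
    import time
    import uuid

    # Illustrative redaction patterns only; a real system would use a
    # vetted PII-detection service rather than two regexes.
    REDACTIONS = [
        (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<SSN>"),
        (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "<EMAIL>"),
    ]

    def redact(text: str) -> str:
        for pattern, token in REDACTIONS:
            text = pattern.sub(token, text)
        return text

    def call_model(prompt: str) -> str:
        # Placeholder for an approved, internally hosted LLM endpoint.
        return "SELECT count(*) FROM events WHERE day = CURRENT_DATE;"

    def generate_sql(request: str, audit_path: str = "llm_audit.jsonl") -> str:
        prompt = redact("Write a SQL query: " + request)
        output = call_model(prompt)
        # Append-only audit record of the full prompt/response exchange.
        record = {"id": str(uuid.uuid4()), "ts": time.time(),
                  "prompt": prompt, "output": output}
        with open(audit_path, "a") as f:
            f.write(json.dumps(record) + "\n")
        return output

The ordering is the point: redact before the model sees anything, and write the audit record before returning the result.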

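The PostgreSQL bullet earlier in the list also lends itself to a concrete example. The sketch below uses psycopg2 to surface the most expensive statements from pg_stat_statements; the DSN is a placeholder, and the mean_exec_time column assumes PostgreSQL 13 or later.

    import psycopg2  # pip install psycopg2-binary

    # Placeholder DSN; the pg_stat_statements extension must be enabled.
    conn = psycopg2.connect("dbname=app user=readonly host=localhost")

    with conn, conn.cursor() as cur:
        # Rank statements by approximate total time consumed.
        cur.execute("""
            SELECT query, calls, mean_exec_time
            FROM pg_stat_statements
            ORDER BY mean_exec_time * calls DESC
            LIMIT 10
        """)
        for query, calls, mean_ms in cur.fetchall():
            print(f"{calls:>8} calls  {mean_ms:8.1f} ms avg  {query[:60]}")
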
Responsibilities

  • Own and operate core data infrastructure, including event streaming, relational database, and batch processing platforms.
  • Design and implement highly reliable, observable, and scalable data systems that enable real-time and batch data processing.
  • Develop automation and guardrails for data governance, retention, and compliance, ensuring auditability and consistency across services.
  • Partner with application, platform, and SRE teams to improve data access patterns, reliability SLAs, and recovery processes.
  • Establish standards for data infrastructure monitoring, alerting, and capacity planning, ensuring proactive issue detection.
  • Drive operational excellence by improving resilience, reducing toil, and implementing self-healing or automated recovery mechanisms.
  • Evolve and optimize data pipelines that support downstream analytics, identity verification, and machine learning systems.
  • Evaluate, implement, and operate event-driven and batch data platforms such as Kafka, Google Pub/Sub, Dataflow, or Temporal (see the consumer sketch after this list).
  • Lead incident response and root cause analysis for production data systems, contributing to postmortems and platform improvements.
  • Mentor engineers and advocate for reliability-focused engineering culture across teams.
  • Data lake architecture: Design and build the data lake storage and compute topology (object storage, partitioning, lifecycle, tiering) to support batch and streaming workloads; a layout sketch follows directly below.
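
As a concrete illustration of the data lake bullet above, here is one common way to express day partitioning and age-based tiering in Python. The bucket name, retention windows, and storage classes are assumptions for the example, not a prescribed design.

    from datetime import date

    BUCKET = "s3://example-data-lake"  # hypothetical bucket name
    HOT_DAYS, WARM_DAYS = 30, 365      # assumed tiering windows, in days

    def partition_key(dataset: str, event_day: date) -> str:
        # Hive-style day partitioning keeps batch and streaming writers
        # aligned on the same object layout.
        return (f"{BUCKET}/{dataset}/"
                f"year={event_day:%Y}/month={event_day:%m}/day={event_day:%d}/")

    def storage_class(event_day: date, today: date | None = None) -> str:
        # Age-based tiering: hot -> infrequent access -> archive.
        age = ((today or date.today()) - event_day).days
        if age <= HOT_DAYS:
            return "STANDARD"
        if age <= WARM_DAYS:
            return "STANDARD_IA"
        return "GLACIER"

    print(partition_key("identity_events", date(2024, 5, 1)))
    print(storage_class(date(2024, 5, 1)))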

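The event-streaming responsibility in the list above can be shown in code as well. Below is a minimal at-least-once consumer sketch using the kafka-python client; the topic, brokers, and group id are placeholders, and a real deployment would batch commits rather than commit per record.

    from kafka import KafkaConsumer  # pip install kafka-python

    consumer = KafkaConsumer(
        "identity-events",                     # hypothetical topic
        bootstrap_servers=["localhost:9092"],  # placeholder brokers
        group_id="data-platform-sketch",
        enable_auto_commit=False,              # commit only after processing
        value_deserializer=lambda raw: raw.decode("utf-8"),
    )

    for message in consumer:
        # Process first, commit second: a crash here re-delivers the record
        # instead of losing it (at-least-once semantics).
        print(message.topic, message.offset, message.value)
        consumer.commit()

Committing manually after processing trades duplicate delivery for durability, which is often the right default for identity-critical pipelines.
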
Benefits

  • comprehensive medical, dental, and vision coverage
  • health savings account
  • flexible spending accounts (medical, limited purpose, dependent care, commuter benefit accounts)
  • basic and voluntary life and AD&D insurance
  • 401(k) with company match
  • parental leave
  • ability to participate in unlimited paid time off subject to the terms and conditions of the PTO policy, including 8 company-wide holidays
  • short and long-term disability insurance
  • accident and critical illness insurance
  • referral bonus policy
  • employee assistance program
  • pet insurance
  • travel assistance program
  • wellbeing and childcare discounts
  • benefit advocates
  • learning and development benefit