Senior Staff Software Engineer, Data Infrastructure

Instacart
1d | $254,000 - $321,000 | Remote

About The Position

Instacart’s backend systems serve millions of customers every year, processing petabytes of data daily to power everything from personalized grocery recommendations to real-time fraud detection and advertiser measurement. As Senior Staff Software Engineer on the Data Infrastructure team, you will set the technical direction for the data platform that underpins the company’s entire data strategy, from the storage and compute layer to streaming infrastructure, analytics tooling, and governance systems.

This is a role for a systems thinker who thrives at the intersection of technical depth and company-level strategic impact. You will define the multi-year architecture roadmap for our core data platform, drive major platform investment decisions, and operate as a thought leader both internally across engineering leadership and externally within the data infrastructure community. Your work will directly influence how Instacart makes decisions at scale and shape the platform economics of one of the most data-intensive technology companies in grocery commerce.

The Data Infrastructure team builds and operates the systems that power Instacart’s data ecosystem: a modern data lakehouse built on Apache Iceberg, a multi-engine compute platform spanning stream processing and analytical query engines, and self-serve tooling that enables Product, Data Science, ML, Ads, Finance, and other engineering teams to move fast with data. We balance two mandates: maintaining a highly reliable production data platform that serves the business today, and architecting the infrastructure that will serve it for the next three to five years. We operate in a world of significant scale and material infrastructure spend, where architectural decisions carry both technical and financial consequences.

Some of the core technologies in our stack include: Apache Iceberg, Apache Flink, Trino, ClickHouse, Apache Kafka, Apache Spark, Snowflake, Databricks, Confluent, Airflow, dbt, Delta Lake, Scala, Python, Postgres, and AWS. The team operates with a high degree of ownership and autonomy. You will work closely with engineering leadership, data science, ML platform, ads infrastructure, finance engineering, and senior stakeholders across the organization.

Requirements

  • 10+ years of software engineering experience focused on data infrastructure or distributed systems at scale. Track record of setting technical direction for large-scale data platforms, defining multi-year architecture roadmaps, and influencing strategy. Experience in high-growth, data-intensive environments with significant infrastructure scale and spend.
  • Expertise in modern data lakehouse architectures, open table formats (Iceberg, Delta Lake, Hudi), and compute/storage trade-offs; in distributed query and compute systems (Trino, Spark, ClickHouse, etc.), including performance tuning and production reliability; and in event-driven infrastructure (Kafka, Flink, etc.).
  • Proven track record owning and executing major infrastructure platform transitions, including build vs. buy, migration design, and risk management.
  • Experience building compelling business cases for infrastructure investments, including cost-benefit analysis and TCO modeling.
  • Exceptional technical communication, with the ability to write clear architecture documents, strategy memos, and proposals that drive leadership alignment. Strong ownership, comfort with ambiguity, and the organizational influence to drive large, multi-team initiatives from concept to production.

Nice To Haves

  • Familiarity with data governance, compliance frameworks (SOX, CPRA, GDPR), and designing governance controls into the platform architecture.
  • Experience with FinOps and data platform cost optimization, including managing multi-million dollar infrastructure budgets and negotiating vendor contracts.
  • Deep knowledge of SQL and strong proficiency in Python or Scala for systems-level work.
  • Experience with orchestration systems (e.g., Apache Airflow) and data transformation pipelines (e.g., dbt) in large-scale production environments.
  • Track record of building and growing high-performing data infrastructure teams.
  • Bachelor’s, Master’s, or PhD in Computer Science, Computer Engineering, Electrical Engineering, or equivalent practical experience.

Responsibilities

  • Define and Drive Data Infrastructure Vision: Own the multi-year technical vision and roadmap for Instacart’s core data platform (storage, compute, streaming, orchestration, analytical serving). Translate company data strategy (monetization, federated access, real-time) into a coherent, actionable architecture plan. Align with leadership and proactively evolve the architecture for scale, maturity, and cost.
  • Lead Platform Strategy (Build, Buy, Ownership): Architect the ownership strategy for the data platform, determining build vs. buy (including managed services vs. open-source self-hosting). Lead technical/business case evaluations, full cost-benefit modeling, and risk analysis for major investments. Design phased migrations to ensure reliability while achieving long-term independence and cost efficiency.
  • Own the Data Lakehouse Foundation: Drive the architecture and delivery of the open lakehouse, including unified table format, compute engine portfolio, and storage governance. Expand multi-engine compute (interactive, batch, stream processing). Define standards for data storage, access, governance, and sharing to enable compute portability and prevent lock-in. Ensure reliable scaling without proportional cost increase.
  • Drive Real-Time and Streaming Infrastructure: Own the architecture for streaming data, event-driven pipelines, stream processing, and real-time serving for critical use cases (Ads, Fraud, ML). Make principled decisions on deployment models balancing cost, availability, and operational maturity.
  • Pioneer AI-native Data Infrastructure Engineering: Lead the adoption, application, and cultural integration of AI/LLM tools across the data platform lifecycle, setting a high standard for AI-augmented workflows, driving high-leverage opportunities from automation to cost optimization, and partnering with other teams to embed AI-powered capabilities into the platform itself.
  • Elevate Engineering Excellence: Serve as the senior technical voice, setting standards for system design and reliability. Lead architecture reviews. Mentor staff/senior engineers, fostering ownership and execution. Be a visible engineering leader, contributing to hiring and cross-org alignment.
  • Partner Deeply with Stakeholders: Collaborate with Data Science, ML Platform, Ads Infra, Product Eng, Finance Eng, and Security to translate needs into reliable, self-serve infrastructure. Represent Data Infra in architectural forums, ensuring decisions support business priorities (monetization, compliance, AI). Clearly communicate complex trade-offs to technical and executive audiences.