About The Position

We are a global leader in online protection, dedicated to making the digital world a safer place. We are seeking a highly experienced, hands-on Principal Data Architect with a strong background in big data engineering, modern lakehouse architectures, and decentralized data governance. This is a unique opportunity to define the strategic vision for our next-generation data platform, transition our ecosystem to a Data Mesh paradigm, and implement robust security standards across a petabyte-scale environment. We highly value experience gained at FAANG or other leading Big Tech companies.

This is a hybrid position based at either our San Jose or Newport Beach, CA office. You will be required to be on-site 2 to 3 days per week; on the remaining days you will work from your home office. We are only considering candidates within commutable distance of our San Jose or Newport Beach, CA offices and are not offering relocation assistance at this time.

About the Role: The role spans strategic architecture and Data Mesh, hands-on engineering leadership, modern data storage and open table formats, data security and governance, real-time and event-driven systems, MLOps and analytics integration, mentorship and culture, and operational excellence. The specific responsibilities are detailed in the Responsibilities section below.

Requirements

  • 10+ years of professional software and data engineering experience, with a significant portion in a technical leadership or architectural role.
  • Data Mesh Expertise: Proven track record of transforming monolithic data warehouses into decentralized Data Mesh architectures, including experience with data contracts and federated governance.
  • Modern Platform Proficiency: Deep, hands-on expertise with modern data platforms such as Databricks and Snowflake, including optimizing compute clusters and managing costs.
  • Big Data Ecosystem: Strong proficiency in distributed computing frameworks (Apache Spark) and open standards (Delta Lake, Iceberg, Hudi).
  • Security First: Extensive experience implementing data privacy regulations (GDPR, CCPA) via technical controls like data masking, tokenization, and encryption at rest/in transit.
  • Coding & Scripting: Expert-level proficiency in Python, SQL, and Scala/Java. You treat data pipelines as code, strictly adhering to CI/CD, unit testing, and version control best practices (see the unit-testing sketch after this list).
  • Streaming & Orchestration: Deep understanding of event-driven architectures (Kafka, Kinesis) and workflow orchestration and transformation tooling (Airflow, Dagster, dbt).
  • Catalog & Discovery: Hands-on experience implementing data discovery and governance tools such as Unity Catalog, Alation, or Collibra.
  • MLOps Familiarity: Strong understanding of the end-to-end machine learning lifecycle, including model registry, feature stores, and model serving.
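As a minimal illustration of the "pipelines as code" expectation above, here is a hypothetical sketch of a small PySpark transformation covered by a pytest unit test; the function, column, and test names are illustrative and not part of our codebase.

```python
# Hypothetical sketch: a unit-tested PySpark transformation (names are illustrative).
import pytest
from pyspark.sql import SparkSession, functions as F


def mask_emails(df, col="email"):
    """Replace everything before the @ with a fixed token."""
    return df.withColumn(col, F.regexp_replace(F.col(col), r"^[^@]+", "***"))


@pytest.fixture(scope="module")
def spark():
    # Local session so the test runs on a laptop or in CI without a cluster.
    return SparkSession.builder.master("local[1]").appName("unit-tests").getOrCreate()


def test_mask_emails(spark):
    df = spark.createDataFrame([("alice@example.com",)], ["email"])
    assert mask_emails(df).collect()[0]["email"] == "***@example.com"
```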

Responsibilities

  • Lead the design, evolution, and implementation of a scalable, decentralized Data Mesh architecture.
  • Define domain boundaries, data products, and federated governance standards to enable self-service analytics across the organization.
  • Provide expert-level, hands-on technical leadership in building foundational data frameworks.
  • Architect and prototype resilient data pipelines using Databricks, Snowflake, and Spark, ensuring high availability and low latency.
  • Drive the adoption of open table formats (Delta Lake, Apache Iceberg) to ensure ACID compliance, time travel, and schema evolution capabilities across the data lakehouse (see the Delta Lake sketch after this list).
  • Architect advanced data security patterns, including dynamic data masking, tokenization, and row-level security.
  • Implement and manage centralized discovery and access control using Unity Catalog or similar enterprise data catalogs (see the Unity Catalog sketch after this list).
  • Design and deploy high-throughput streaming architectures using Kafka or Pulsar for real-time data ingestion and processing (see the streaming sketch after this list).
  • Collaborate closely with data scientists to operationalize machine learning models.
  • Build robust MLOps pipelines and feature stores that bridge the gap between data engineering and production AI/ML (see the MLflow sketch after this list).
  • Mentor and guide a team of senior data engineers and architects, fostering a culture of technical excellence, automation (CI/CD), and "Data as a Product" thinking.
  • Troubleshoot complex performance bottlenecks in distributed systems and establish best practices for cost optimization and observability (Data Observability) in cloud environments.
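To make the open table format work concrete, the following is a minimal, hypothetical Delta Lake sketch (assuming Databricks or delta-spark) showing ACID writes, schema evolution, and time travel; table and column names are illustrative.

```python
# Hypothetical sketch: Delta Lake ACID writes, schema evolution, and time travel.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Initial ACID write of a managed Delta table (names are illustrative).
events = spark.range(100).withColumnRenamed("id", "event_id")
events.write.format("delta").mode("overwrite").saveAsTable("analytics.events")

# Schema evolution: append a frame with an extra column and let Delta merge the schema.
enriched = events.withColumn("source", F.lit("web"))
(enriched.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .saveAsTable("analytics.events"))

# Time travel: read the table as of an earlier version for audits or reproducibility.
v0 = spark.read.format("delta").option("versionAsOf", 0).table("analytics.events")
```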
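The row-level security and masking work can likewise be sketched against Unity Catalog. The snippet below is a hypothetical example issued through spark.sql from a notebook; catalog, schema, table, and group names are illustrative, and the filter predicate is a placeholder.

```python
# Hypothetical sketch: Unity Catalog row filters and dynamic column masks via SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Row filter: admins see everything, everyone else only an illustrative subset.
spark.sql("""
    CREATE OR REPLACE FUNCTION main.governance.region_filter(region STRING)
    RETURN is_account_group_member('data_admins') OR region = 'US'
""")
spark.sql("""
    ALTER TABLE main.sales.orders
    SET ROW FILTER main.governance.region_filter ON (region)
""")

# Dynamic column mask: only the HR group sees raw SSNs; others get a redacted value.
spark.sql("""
    CREATE OR REPLACE FUNCTION main.governance.mask_ssn(ssn STRING)
    RETURN CASE WHEN is_account_group_member('hr') THEN ssn ELSE '***-**-****' END
""")
spark.sql("ALTER TABLE main.hr.employees ALTER COLUMN ssn SET MASK main.governance.mask_ssn")
```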
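For the streaming responsibility, here is a minimal, hypothetical sketch using Spark Structured Streaming to ingest from Kafka into a Delta table; broker addresses, topic, table, and checkpoint paths are illustrative.

```python
# Hypothetical sketch: Kafka ingestion with Spark Structured Streaming into Delta.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

events = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")  # illustrative brokers
    .option("subscribe", "security-events")                          # illustrative topic
    .option("startingOffsets", "latest")
    .load()
    .select(F.col("key").cast("string"),
            F.col("value").cast("string"),
            "timestamp"))

query = (events.writeStream
    .format("delta")
    .option("checkpointLocation", "/checkpoints/security-events")    # illustrative path
    .outputMode("append")
    .toTable("analytics.security_events"))
```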
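Finally, the MLOps collaboration can be illustrated with a small, hypothetical MLflow example: training a model, logging a metric, and registering the model so it can be promoted and served; the registered model name is illustrative.

```python
# Hypothetical sketch: log and register a model with MLflow for downstream serving.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
model = LogisticRegression().fit(X, y)

with mlflow.start_run():
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model,
                             artifact_path="model",
                             registered_model_name="fraud_scoring")  # illustrative name
```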

Benefits

  • Bonus Program
  • Pension and Retirement Plans
  • Medical, Dental and Vision Coverage
  • Paid Time Off
  • Paid Parental Leave
  • Support for Community Involvement