About The Position

We are a global leader in online protection, dedicated to making the digital world a safer place. We are seeking a highly experienced, hands-on Principal Data Architect with a strong background in big data engineering, modern lakehouse architectures, and decentralized data governance. This is a unique opportunity to define the strategic vision for our next-generation data platform, transition our ecosystem to a Data Mesh paradigm, and implement robust security standards across a petabyte-scale environment. We highly value experience gained at FAANG or other leading Big Tech companies.

This is a hybrid position based at either our San Jose or Newport Beach, CA office. You will be required to be on-site 2 to 3 days per week; on the remaining days you will work from your home office. We are only considering candidates within commutable distance of our San Jose or Newport Beach, CA offices and are not offering relocation assistance at this time.

About the Role: The role spans strategic architecture and Data Mesh, hands-on engineering leadership, modern data storage and open table formats, data security and governance, real-time and event-driven systems, MLOps and analytics integration, mentorship and culture, and operational excellence. The specific responsibilities are detailed in the Responsibilities section below.

Requirements

  • 10+ years of professional software and data engineering experience, with a significant portion in a technical leadership or architectural role.
  • Data Mesh Expertise: Proven track record of transforming monolithic data warehouses into decentralized Data Mesh architectures, including experience with data contracts and federated governance.
  • Modern Platform Proficiency: Deep, hands-on expertise with modern data platforms such as Databricks and Snowflake, including optimizing compute clusters and managing costs.
  • Big Data Ecosystem: Strong proficiency in distributed computing frameworks (Apache Spark) and open standards (Delta Lake, Iceberg, Hudi).
  • Security First: Extensive experience implementing data privacy regulations (GDPR, CCPA) via technical controls like data masking, tokenization, and encryption at rest/in transit.
  • Coding & Scripting: Expert-level proficiency in Python, SQL, and Scala/Java. You treat data pipelines as code, strictly adhering to CI/CD, unit testing, and version control best practices (see the unit-testing sketch after this list).
  • Streaming & Orchestration: Deep understanding of event-driven architectures (Kafka, Kinesis) and workflow orchestration and transformation tooling (Airflow, Dagster, dbt).
  • Catalog & Discovery: Hands-on experience implementing data discovery and governance tools such as Unity Catalog, Alation, or Collibra.
  • MLOps Familiarity: Strong understanding of the end-to-end machine learning lifecycle, including model registry, feature stores, and model serving.
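As a minimal illustration of the "pipelines as code" expectation above, here is a hypothetical sketch of a small PySpark transformation covered by a pytest unit test; the function, column, and test names are illustrative and not part of our codebase.

```python
# Hypothetical sketch: a unit-tested PySpark transformation (names are illustrative).
import pytest
from pyspark.sql import SparkSession, functions as F


def mask_emails(df, col="email"):
    """Replace everything before the @ with a fixed token."""
    return df.withColumn(col, F.regexp_replace(F.col(col), r"^[^@]+", "***"))


@pytest.fixture(scope="module")
def spark():
    # Local session so the test runs on a laptop or in CI without a cluster.
    return SparkSession.builder.master("local[1]").appName("unit-tests").getOrCreate()


def test_mask_emails(spark):
    df = spark.createDataFrame([("alice@example.com",)], ["email"])
    assert mask_emails(df).collect()[0]["email"] == "***@example.com"
```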

Responsibilities

  • Lead the design, evolution, and implementation of a scalable, decentralized Data Mesh architecture.
  • Define domain boundaries, data products, and federated governance standards to enable self-service analytics across the organization.
  • Provide expert-level, hands-on technical leadership in building foundational data frameworks.
  • Architect and prototype resilient data pipelines using Databricks, Snowflake, and Spark, ensuring high availability and low latency.
  • Drive the adoption of open table formats (Delta Lake, Apache Iceberg) to ensure ACID compliance, time travel, and schema evolution capabilities across the data lakehouse (see the Delta Lake sketch after this list).
  • Architect advanced data security patterns, including dynamic data masking, tokenization, and row-level security.
  • Implement and manage centralized discovery and access control using Unity Catalog or similar enterprise data catalogs (see the Unity Catalog sketch after this list).
  • Design and deploy high-throughput streaming architectures using Kafka or Pulsar for real-time data ingestion and processing (see the streaming sketch after this list).
  • Collaborate closely with data scientists to operationalize machine learning models.
  • Build robust MLOps pipelines and feature stores that bridge the gap between data engineering and production AI/ML (see the MLflow sketch after this list).
  • Mentor and guide a team of senior data engineers and architects, fostering a culture of technical excellence, automation (CI/CD), and "Data as a Product" thinking.
  • Troubleshoot complex performance bottlenecks in distributed systems and establish best practices for cost optimization and observability (Data Observability) in cloud environments.
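To make the open table format work concrete, the following is a minimal, hypothetical Delta Lake sketch (assuming Databricks or delta-spark) showing ACID writes, schema evolution, and time travel; table and column names are illustrative.

```python
# Hypothetical sketch: Delta Lake ACID writes, schema evolution, and time travel.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Initial ACID write of a managed Delta table (names are illustrative).
events = spark.range(100).withColumnRenamed("id", "event_id")
events.write.format("delta").mode("overwrite").saveAsTable("analytics.events")

# Schema evolution: append a frame with an extra column and let Delta merge the schema.
enriched = events.withColumn("source", F.lit("web"))
(enriched.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .saveAsTable("analytics.events"))

# Time travel: read the table as of an earlier version for audits or reproducibility.
v0 = spark.read.format("delta").option("versionAsOf", 0).table("analytics.events")
```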
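The row-level security and masking work can likewise be sketched against Unity Catalog. The snippet below is a hypothetical example issued through spark.sql from a notebook; catalog, schema, table, and group names are illustrative, and the filter predicate is a placeholder.

```python
# Hypothetical sketch: Unity Catalog row filters and dynamic column masks via SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Row filter: admins see everything, everyone else only an illustrative subset.
spark.sql("""
    CREATE OR REPLACE FUNCTION main.governance.region_filter(region STRING)
    RETURN is_account_group_member('data_admins') OR region = 'US'
""")
spark.sql("""
    ALTER TABLE main.sales.orders
    SET ROW FILTER main.governance.region_filter ON (region)
""")

# Dynamic column mask: only the HR group sees raw SSNs; others get a redacted value.
spark.sql("""
    CREATE OR REPLACE FUNCTION main.governance.mask_ssn(ssn STRING)
    RETURN CASE WHEN is_account_group_member('hr') THEN ssn ELSE '***-**-****' END
""")
spark.sql("ALTER TABLE main.hr.employees ALTER COLUMN ssn SET MASK main.governance.mask_ssn")
```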
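For the streaming responsibility, here is a minimal, hypothetical sketch using Spark Structured Streaming to ingest from Kafka into a Delta table; broker addresses, topic, table, and checkpoint paths are illustrative.

```python
# Hypothetical sketch: Kafka ingestion with Spark Structured Streaming into Delta.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

events = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")  # illustrative brokers
    .option("subscribe", "security-events")                          # illustrative topic
    .option("startingOffsets", "latest")
    .load()
    .select(F.col("key").cast("string"),
            F.col("value").cast("string"),
            "timestamp"))

query = (events.writeStream
    .format("delta")
    .option("checkpointLocation", "/checkpoints/security-events")    # illustrative path
    .outputMode("append")
    .toTable("analytics.security_events"))
```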
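Finally, the MLOps collaboration can be illustrated with a small, hypothetical MLflow example: training a model, logging a metric, and registering the model so it can be promoted and served; the registered model name is illustrative.

```python
# Hypothetical sketch: log and register a model with MLflow for downstream serving.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
model = LogisticRegression().fit(X, y)

with mlflow.start_run():
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model,
                             artifact_path="model",
                             registered_model_name="fraud_scoring")  # illustrative name
```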

Benefits

  • Bonus Program
  • Pension and Retirement Plans
  • Medical, Dental and Vision Coverage
  • Paid Time Off
  • Paid Parental Leave
  • Support for Community Involvement