About The Position

  • Strategic Architecture & Data Mesh: Lead the design, evolution, and implementation of a scalable, decentralized Data Mesh architecture. Define domain boundaries, data products, and federated governance standards to enable self-service analytics across the organization.
  • Hands-on Engineering Leadership: Provide expert-level, hands-on technical leadership in building foundational data frameworks. Architect and prototype resilient data pipelines using Databricks, Snowflake, and Spark, ensuring high availability and low latency.
  • Modern Data Storage & Formats: Drive the adoption of open table formats (Delta Lake, Apache Iceberg) to ensure ACID compliance, time travel, and schema evolution capabilities across the data lakehouse.
  • Data Security & Governance: Architect advanced data security patterns, including dynamic data masking, tokenization, and row-level security. Implement and manage centralized discovery and access control using Unity Catalog or similar enterprise data catalogs.
  • Real-Time & Event-Driven Systems: Design and deploy high-throughput streaming architectures using Kafka or Pulsar for real-time data ingestion and processing.
  • MLOps & Analytics Integration: Collaborate closely with data scientists to operationalize machine learning models. Build robust MLOps pipelines and feature stores that bridge the gap between data engineering and production AI/ML.
  • Mentorship & Culture: Mentor and guide a team of senior data engineers and architects, fostering a culture of technical excellence, automation (CI/CD), and "Data as a Product" thinking.
  • Operational Excellence: Troubleshoot complex performance bottlenecks in distributed systems and establish best practices for cost optimization and data observability in cloud environments.

Requirements

  • 10+ years of professional software and data engineering experience, with a significant portion in a technical leadership or architectural role.
  • Data Mesh Expertise: Proven track record of transforming monolithic data warehouses into decentralized Data Mesh architectures, including experience with data contracts and federated governance.
  • Modern Platform Proficiency: Deep, hands-on expertise with modern data platforms such as Databricks and Snowflake, including optimizing compute clusters and managing costs.
  • Big Data Ecosystem: Strong proficiency in distributed computing frameworks (Apache Spark) and open standards (Delta Lake, Iceberg, Hudi).
  • Security First: Extensive experience implementing data privacy regulations (GDPR, CCPA) via technical controls like data masking, tokenization, and encryption at rest/in transit.
  • Coding & Scripting: Expert-level proficiency in Python, SQL, and Scala/Java. You treat data pipelines as code, strictly adhering to CI/CD, unit testing, and version control best practices.
  • Streaming & Orchestration: Deep understanding of event-driven architectures (Kafka, Kinesis) and workflow orchestration and transformation tools (Airflow, dbt, Dagster).
  • Catalog & Discovery: Hands-on experience implementing data discovery and governance tools such as Unity Catalog, Alation, or Collibra.
  • MLOps Familiarity: Strong understanding of the end-to-end machine learning lifecycle, including model registry, feature stores, and model serving.

Benefits

  • Bonus Program
  • 401k Retirement Plan
  • Medical, Dental, Vision, Basic Life, Short Term Disability and Long-Term Disability Coverage
  • Paid Parental Leave
  • Support for Community Involvement
  • 14 Paid Company Holidays
  • Unlimited Paid Time Off for Exempt Employees
  • 96 Hours of Sick Time and 120 Hours of Vacation Accrued Each Year for Non-Exempt Employees