Distinguished Software Engineer (Data Security - Big Data & AI)

Palo Alto Networks, Santa Clara, CA
$230,000 - $300,000 | Onsite

About The Position

At Palo Alto Networks, we are redefining cybersecurity. As a Distinguished Engineer on the Enterprise DLP team, you will be the foremost technical leader responsible for architecting and scaling the data platform that underpins our industry-leading cloud-delivered DLP service. Your mission is to establish the standards and systems necessary to process and analyze massive volumes of sensitive data, leveraging cutting-edge AI/ML, to ensure our customers' data remains protected across all network, cloud, and user vectors.

Requirements

  • BS/MS in Computer Science or Electrical Engineering, or equivalent practical or military experience, required

Nice To Haves

  • 12+ years of experience in high-scale, data-intensive environments, including at least 3 years operating as a Distinguished or Principal-level Engineer/Architect
  • Mastery of Google Cloud Platform (GCP), with extensive hands-on experience architecting and scaling solutions using BigQuery and Vertex AI, or equivalent AWS, Azure, or other Big Data & AI services
  • Expertise in Big Data processing frameworks and managed services, specifically with building and scaling data and analytics pipelines using Dataflow, Pub/Sub, and GKE (or equivalent technologies like Apache Spark/Kafka)
  • Strong experience in SQL & NoSQL databases (e.g., MongoDB, Cassandra, Spanner), with an understanding of their respective architectural trade-offs for distributed systems
  • Demonstrated ability to design scalable data models and systems that enable high-precision detections
  • Proven ability to build and optimize clean, well-structured analytical datasets for large-scale business and data science use cases
  • Demonstrated experience in implementing and supporting Big Data solutions for both batch (scheduled) and real-time (streaming) analytics
  • Prior experience in the security domain (especially DLP, Data Security, or Cloud Security) is a significant advantage
  • Exceptional ability to influence technical and business leaders, translating ambiguous problems into clear, executable technical designs

Responsibilities

  • Define Architectural Roadmap: Set the 3-5 year technical strategy and architectural vision for the Enterprise DLP data platform, emphasizing scalability, performance, security, and cost-efficiency
  • Big Data & AI Foundation: Drive the design, implementation, scaling, and evangelism of the core platform components (BigQuery, Vertex AI, Nvidia Triton, Kubeflow) that enable high-velocity data ingestion, transformation, and Machine Learning model serving for DLP detections
  • Real-time Decisioning: Architect and implement ultra-low latency data ingestion and processing systems (utilizing Kafka, Pub/Sub, Dataflow) to enable real-time DLP policy enforcement and alert generation at massive enterprise scale
  • Cross-Functional Influence: Act as the technical voice of the DLP data platform, collaborating with Engineering VPs, Product Management, and Data Science teams to align platform capabilities with product innovation
  • Big Data Pipeline Mastery: Architect and lead the design and implementation of highly resilient, optimized batch and real-time data pipelines (ETL/ELT) that transform raw data streams into high-quality, actionable datasets
  • Optimized Datasets: Expertly design and optimize clean, well-structured analytical datasets within BigQuery, focusing on partitioning, clustering, and schema evolution to maximize query performance for both operational analytics and complex data science/ML feature generation
  • Database Strategy: Provide deep, hands-on expertise in both SQL and NoSQL databases (e.g., MongoDB, Spanner, BigQuery), advising on the optimal data persistence layer for diverse DLP data use cases (e.g., policy configurations, high-speed telemetry, analytical fact tables)
  • MLOps Implementation: Establish robust MLOps practices and model deployment/execution pipelines (e.g., Vertex AI, Nvidia Triton) for DLP models, including automated pipelines for continuous training, versioning, deployment, and monitoring of model drift
  • Performance Engineering: Debug, optimize, and tune the most challenging performance bottlenecks across the entire data platform, from initial data ingestion to final analytics query execution, often dealing with PBs of data
  • Technical Mentorship: Mentor and develop Principal and Staff-level engineers, raising the bar for engineering craftsmanship and data platform development across the organization
  • Operational Health: Define and implement advanced observability, monitoring, and alerting strategies to ensure the end-to-end health and SLOs of the mission-critical DLP data service