Distinguished Engineer - AI Infrastructure Architecture

Cisco Systems, Inc. • Cheyenne, WY

About The Position

Splunk, a Cisco company, is building a safer and more resilient digital world with an end-to-end, full-stack platform made for a hybrid, multi-cloud world. Leading enterprises use our unified security and observability platform to keep their digital systems secure and reliable. Come help organizations be their best, while you reach new heights with a team that has your back.

Meet the Team

Our Distinguished Engineer team drives architecture and technical direction for Splunk's $3.5B platform, ingesting petabytes of data for over 95% of the Fortune 100. As a tight-knit group of DEs reporting to the VP of Architecture, we operate autonomously while coordinating closely on practical, durable solutions that evolve the platform for future requirements. Each DE owns cross-organizational domains that shift fluidly as needs arise. Work alongside exceptionally smart engineers who value rigorous thinking and friendly collaboration. This is where the technical future gets defined!

Impact

Architect and operationalize AI infrastructure as a core component of Cisco Data Fabric, enabling engineering teams to integrate AI capabilities across the world's largest, most diverse data sets. Design MLOps platforms that solve complex lifecycle challenges (managing embedding model migrations, version compatibility, and continuous model updates) while ensuring operational excellence through monitoring, governance, and serving at petabyte scale. Identify and drive new opportunities for leveraging AI across the stack, establishing consistent architectural patterns for agent prompts, tool integration, and model orchestration. Build data pipelines and operational frameworks that enable hundreds of engineers to confidently deploy both LLMs and traditional ML models into production. Success is measured by customer adoption and usage of AI-powered features, alongside operational metrics (model performance, inference latency, and deployment velocity) that directly impact customer value.

Why Cisco?

At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era - and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint. Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere. We are Cisco, and our power starts with you.

Requirements

Bachelor's in Computer Science (or equivalent) with 15+ years of related experience; or Master's with 12+ years; or PhD with 8+ years or equivalent experience
Designed and deployed AI/ML infrastructure and features in production cloud environments
Experience with major AI services including OpenAI, Anthropic, HuggingFace, AWS Bedrock, Azure OpenAI Service, or similar platforms
Production experience with AWS, Azure, or GCP cloud platforms
Led technical decisions and architectural direction across engineering organizations of 50+ engineers

Nice To Haves

Experience with LLM-specific infrastructure including agent frameworks, prompt management, and tool integration
Model serving at scale with experience in inference optimization and performance monitoring
Designed and deployed AI/ML infrastructure for on-premises environments
Experience with major ML frameworks (TensorFlow, PyTorch, etc.) and model formats
Proven track record mentoring and growing engineers, with strong collaboration skills

Responsibilities

Architect and operationalize AI infrastructure as a core component of Cisco Data Fabric.
Design MLOps platforms that solve complex lifecycle challenges (managing embedding model migrations, version compatibility, and continuous model updates) while ensuring operational excellence through monitoring, governance, and serving at petabyte scale.
Identify and drive new opportunities for leveraging AI across the stack, establishing consistent architectural patterns for agent prompts, tool integration, and model orchestration.
Build data pipelines and operational frameworks that enable hundreds of engineers to confidently deploy both LLMs and traditional ML models into production.
