Staff Machine Learning Engineer

Cisco Systems, Inc., San Jose, CA

About The Position

Join the engineering team building the intelligent backbone of Splunk Observability Cloud. We are committed to leveraging the latest advancements in data science and machine learning to unlock unprecedented value from massive volumes of telemetry (metrics, traces, and logs) at petabyte scale. This role involves researching, developing, and deploying core analytical components focused on streaming anomaly detection, predictive intelligence, and automated root-cause analysis. If you thrive on the challenge of building enterprise-grade, scalable ML systems and applying sophisticated techniques to complex, high-impact problems, you will be instrumental in delivering the full-stack, real-time answers and automation required for our customers to achieve true digital resilience across any cloud-native or hybrid environment.

Requirements

  • Master's degree in computer science or related field and 7+ years of software engineering experience, or bachelor's degree with 10+ years of experience.
  • Experience designing and building scalable cloud-based systems (AWS, Azure, or GCP), including containerization and orchestration (e.g., Docker, Kubernetes).
  • Proven experience in technical leadership, architecture design, and end-to-end feature ownership in AI/ML or platform domains.
  • Experience with API design and frameworks (e.g., OpenAPI, GraphQL, gRPC, REST).
  • Prior experience delivering RAG and agentic products to production.
  • Expertise with vibe coding tools (Claude Code, Codex, Copilot, Windsurf, Cursor) is a must.
  • Up-to-date knowledge of the latest agentic and generative AI industry trends and frameworks.

Nice To Haves

  • Exceptional problem-solving skills, with the ability to analyze complex requirements and propose effective solutions.
  • Experience developing, deploying, and maintaining applications in AWS environments using cloud-native solutions.
  • Experience monitoring and analyzing metric, trace, span, and log data.
  • Background in observability, generative AI, or model robustness.

Responsibilities

  • Apply the latest generative and agentic AI techniques to build AI features in Splunk Observability.
  • Collaborate across engineering and product teams to establish robust frameworks for evaluating AI systems' trustworthiness and resilience.
  • Provide technical leadership and mentorship within the team, establishing leading practices for development, testing, and artifact management.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Industry: Professional, Scientific, and Technical Services
  • Number of Employees: 5,001-10,000
