Senior Data Scientist

Hewlett Packard Enterprise
Sunnyvale, CA
Onsite

About The Position

Hewlett Packard Enterprise is the global edge-to-cloud company, helping companies connect, protect, analyze, and act on their data and applications from edge to cloud. The company fosters a culture that values varied backgrounds, flexibility, and bold moves.

The Senior Data Scientist will conduct data science research and develop software applications for HPE's AI datacenter technology and autonomous platform, with a focus on providing visibility into the user experience and improving operational efficiency. The individual will collaborate with other engineers to build next-generation autonomous datacenter networks using big data and predictive models, and will leverage network data to power the inference engine of the Mist platform and systems, including the Mist virtual assistant chatbot. Applying knowledge of network communication, machine learning, and software engineering, the Senior Data Scientist will develop and implement scalable algorithms that process large volumes of streaming data for real-time anomaly detection, problem prediction, Root Cause Analysis (RCA), and classification. The role also includes developing software and algorithms to enhance cloud intelligence for Marvis.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or Mathematics, or equivalent experience
  • 5+ years of experience in search indexing, ranking, information retrieval, and querying
  • Proficient in Python and Golang
  • Proficient in implementing NLP, Machine Learning models and algorithms into production at scale
  • Solid statistics and math background; good knowledge of machine learning methods such as k-Nearest Neighbors, Naive Bayes, SVM, and Decision Forests
  • Excellent communication skills to articulate observations and use cases, through data visualization tools, with PMs and network domain experts who are not experienced in AI/ML
  • Good understanding of datacenter networking topology and protocols
  • Knowledge of the multi-cloud production environment
  • Ability to troubleshoot open-source data processing engines such as Apache Spark, Apache Storm, and Apache Flink
  • Good knowledge of and experience with big data tool sets and techniques for distributed storage and computation engines
  • Good understanding of MCPs and Agentic frameworks

Nice To Haves

  • PhD degree in Statistics, Operations Research, Computer Science or equivalent and 5+ years of relevant experience
  • Master's Degree in Statistics, Operations Research, Computer Science or equivalent and at least 8 years of relevant experience
  • Experience with statistical data analysis, data mining, and querying
  • Experience in deploying and leading complete ML platforms in AWS/GCP/Azure
  • Experience with time series data analysis, forecasting and correlation
  • Experience with the latest AI/ML techniques, such as neural networks and Transformers, for time series data, or interest in exploring these techniques for time series data
  • Accountability
  • Action Planning
  • Active Learning
  • Active Listening
  • Agile Methodology
  • Agile Scrum Development
  • Analytical Thinking
  • Coaching
  • Creativity
  • Critical Thinking
  • Cross-Functional Teamwork
  • Data Analysis Management
  • Data Collection Management
  • Data Controls
  • Design
  • Design Thinking
  • Empathy
  • Follow-Through
  • Group Problem Solving
  • Growth Mindset
  • Intellectual Curiosity
  • Long Term Planning
  • Managing Ambiguity

Responsibilities

  • Design and implement machine learning solutions that process terabytes of streaming data to detect anomalies in our customers' DC networks, predict problems and future trends, and provide Root Cause Analysis
  • Analyze feature requirements from product managers and collaborate with engineers and data scientists to design solutions
  • Troubleshoot production environment and customer reported issues
  • Utilize analytical and programming skills and open-source systems such as Hadoop, Hive, Spark, Elasticsearch, and Redis to develop data processing pipelines that meet efficacy and latency requirements
  • Develop reusable and highly scalable data processing components
  • Work with cloud-based CI/CD tools and cloud DevOps teams to collect stats and create monitors for our data processing pipelines

Benefits

  • Comprehensive suite of benefits that supports physical, financial and emotional wellbeing
  • Programs catered to helping you reach any career goals
  • Unconditional inclusion in the way we work and celebrate individual uniqueness
  • Flexibility to manage work and personal needs
  • Variable incentives