Senior Manager, Security Production Engineering

CoreWeave · Livingston, NJ
74d · $188,000 - $275,000 · Hybrid

About The Position

CoreWeave is the AI Hyperscaler, delivering a cloud platform of cutting-edge services powering the next wave of AI. Our technology provides enterprises and leading AI labs with the most performant, efficient, and resilient solutions for accelerated computing. Since 2017, CoreWeave has operated a growing footprint of data centers covering every region of the US and across Europe. CoreWeave was ranked as one of the TIME100 most influential companies of 2024.

As the leader in the industry, we thrive in an environment where adaptability and resilience are key. Our culture offers career-defining opportunities for those who excel amid change and challenge. If you're someone who thrives in a dynamic environment, enjoys solving complex problems, and is eager to make a significant impact, CoreWeave is the place for you. Join us, and be part of a team solving some of the most exciting challenges in the industry. CoreWeave powers the creation and delivery of the intelligence that drives innovation.

Requirements

  • A proven leader who can set direction, motivate teams, and scale engineering organizations
  • A technologist who thrives on solving complex distributed systems problems in cloud environments
  • A collaborator who builds strong relationships across security, infrastructure, and product teams
  • A mentor who invests in people and builds a culture of continuous learning, security, and operational excellence

Nice To Haves

  • Familiarity with observability platforms such as Prometheus, Grafana, or Datadog
  • Experience with cloud infrastructure providers such as AWS, Azure, or GCP
  • Experience defining and tracking SLIs and SLOs for production systems
  • Strong background in incident management practices at scale
  • Ability to communicate technical strategies and recommendations to both technical and executive stakeholders

Responsibilities

  • Leading and scaling the Production Engineering, Security team by setting vision, priorities, and execution strategies
  • Designing, implementing, and maintaining scalable and highly available security-related infrastructure leveraging Kubernetes and cloud-native technologies
  • Driving the development of automation, monitoring, and observability solutions to proactively detect and mitigate reliability or performance risks
  • Establishing and tracking Service Level Objectives (SLOs) for uptime, latency, and resiliency of critical security-related infrastructure
  • Overseeing incident management practices including root cause analysis, stakeholder communication, and preventative improvements
  • Partnering with cross-functional teams such as Infrastructure, Platform, Security Engineering, and Applications to ensure secure-by-default and resilient system designs
  • Acting as a thought leader and mentor by coaching engineers on best practices in reliability engineering, infrastructure management, and security operations
  • Driving continuous improvement in processes, tools, and automation for secure and scalable infrastructure

Benefits

  • Medical, dental, and vision insurance - 100% paid for by CoreWeave
  • Company-paid Life Insurance
  • Voluntary supplemental life insurance
  • Short- and long-term disability insurance
  • Flexible Spending Account
  • Health Savings Account
  • Tuition Reimbursement
  • Ability to Participate in Employee Stock Purchase Program (ESPP)
  • Mental Wellness Benefits through Spring Health
  • Family-Forming support provided by Carrot
  • Paid Parental Leave
  • Flexible, full-service childcare support with Kinside
  • 401(k) with a generous employer match
  • Flexible PTO
  • Catered lunch each day in our office and data center locations
  • A casual work environment
  • A work culture focused on innovative disruption