Braze-posted 12 days ago
Full-time • Mid Level
Hybrid • New York, NY

At Braze, we have found our people. We're a genuinely approachable, exceptionally kind, and intensely passionate crew. We seek to ignite that passion by setting high standards, championing teamwork, and creating work-life harmony as we collectively navigate rapid growth on a global scale while striving for greater equity and opportunity - inside and outside our organization.

To flourish here, you must be prepared to set a high bar for yourself and those around you. There is always a way to contribute: acting with autonomy, having accountability, and being open to new perspectives are essential to our continued success. Our deep curiosity to learn and our eagerness to share diverse passions with others give us balance and inject a one-of-a-kind vibrancy into our culture.

If you are driven to solve exhilarating challenges and have a bias toward action in the face of change, you will be empowered to make a real impact here, with a sharp and passionate team at your back. If Braze sounds like a place where you can thrive, we can't wait to meet you.

WHAT YOU'LL DO

Platform Infrastructure Engineers (PIEs) are responsible for managing, maintaining, and evolving the foundational infrastructure that supports our internal Infrastructure-as-a-Service platform. PIEs specialize in building robust, scalable, and highly available systems such as Kubernetes clusters, Kafka ecosystems, and cloud environments. They apply sound engineering principles, operational discipline, and mature automation to ensure a reliable infrastructure foundation for all platform services and applications.

Our team helps to improve automation infrastructure reliability and empowers Braze's other engineering teams to easily leverage the infrastructure products and platforms we create. Braze operates at a massive scale, with over 3.3 billion monthly active users across our customers, collecting hundreds of billions of data points each month and sending billions of messages to end users daily.
We use a diverse technology stack rooted in Ruby on Rails, MongoDB, Redis, Kafka, Kubernetes, and more. As a Platform Software Engineer at Braze, you will collaborate with your team and consumer engineering teams to build and continuously improve the Infrastructure-as-a-Service platform that every other team at Braze depends on.

Design and Manage Infrastructure:
  • Build, optimize, and manage foundational systems such as Kubernetes clusters, Kafka ecosystems, and cloud resources (e.g., EC2, S3)
  • Develop automation frameworks for provisioning and maintaining infrastructure at scale
  • Design scalable architectures to support seamless operations of platform services

Ensure Reliability and Performance:
  • Implement high-availability and fault-tolerant infrastructure strategies
  • Collaborate with Platform Software Engineers and Product teams to establish and meet Service Level Objectives (SLOs) for infrastructure components
  • Continuously monitor and optimize infrastructure performance to meet evolving demands

Incident Response and Resilience:
  • Be part of a PagerDuty rotation to respond to infrastructure-related incidents
  • Implement failover strategies, backups, and disaster recovery plans to mitigate risks
  • Conduct root cause analyses and retrospectives to improve system resilience

Collaboration and Knowledge Sharing:
  • Partner with Platform Software Engineers to integrate infrastructure with service abstractions and APIs
  • Document processes, tools, and best practices to streamline development and operations
  • Share expertise and mentor team members to foster a culture of operational excellence

Innovate and Automate:
  • Stay ahead of emerging trends in infrastructure technology and integrate innovative solutions
  • Reduce manual tasks by developing automated solutions for infrastructure provisioning, scaling, and maintenance
  • Optimize for performance, security, and scalability in all aspects of infrastructure design

WHO YOU ARE

Experience:
  • 5+ years managing and scaling large-scale infrastructure systems in production environments
  • Proven expertise with Kubernetes, Kafka, cloud services (AWS/GCP/Azure), and configuration management tools

Skills:
  • Proficiency in infrastructure as code (IaC) tools like Terraform, Ansible, or similar
  • Strong understanding of network architecture, security, and performance tuning
  • Familiarity with containerization, service discovery, and load-balancing technologies
  • Excellent ability to manage multiple tasks and expectations at once

Mindset:
  • Focused on building robust, scalable systems that enhance developer productivity
  • Collaborative and communicative, with a strong desire to document and share knowledge
  • Committed to continuous improvement, staying ahead of technological advancements
  • Have an urge to collaborate, document, and deliver quickly:
      • Collaborate across global remote teams, often working asynchronously
      • Document everything so you don't need to learn the same thing (or plan the same work) twice
      • Deliver fast to delight our customers - even internal ones

WHAT WE OFFER

  • Competitive compensation that may include equity
  • Retirement and Employee Stock Purchase Plans
  • Flexible paid time off
  • Comprehensive benefit plans covering medical, dental, vision, life, and disability
  • Family services that include fertility benefits and equal paid parental leave
  • Professional development supported by formal career pathing, learning platforms, and a yearly learning stipend
  • A curated in-office employee experience, designed to foster community, team connections, and innovation
  • Opportunities to give back to your community, including an annual company-wide Volunteer Week and donation matching
  • Employee Resource Groups that provide supportive communities within Braze
  • Collaborative, transparent, and fun culture recognized as a Great Place to Work®