Senior AI Site Reliability Engineer

Charles Schwab | San Francisco, CA
2d | Onsite

About The Position

Your Opportunity

At Schwab, you will build a rewarding career while making a difference in the lives of our millions of clients. Here, innovative thinking meets creative problem solving as we work together to challenge the status quo. We believe in the power of collaboration and value being together in the office, which is why this role is based on-site in our San Francisco office. Joining Schwab means joining a company committed to transforming the financial industry and putting clients at the center of everything we do.

Schwab's AI Strategy & Transformation team, known as AI.x, is the central hub for Artificial Intelligence at Schwab. We are an integrated product, engineering, strategy and risk team, all based in San Francisco. We help set the enterprise vision for AI, invest in the most promising opportunities, and accelerate delivery across the company. We also build the core platform that powers AI at scale and explore next-generation GenAI efforts that will redefine how we serve our clients.

As a Senior AI Site Reliability Engineer on AI.x, you will play a key role in ensuring our AI solutions are reliable, scalable, and resilient, enabling us to deliver innovative experiences to millions of clients. This role is more than a reliability engineering position. It is an opportunity to join a high-profile team shaping Schwab's future with AI, to build and maintain solutions that matter to millions of clients, and to grow your career in one of the most exciting areas of technology today.

In this role, you will design, implement, and manage the reliability and operational excellence of GenAI applications and platforms. You will work closely with architects, engineers, and business leaders to align reliability practices with Schwab's enterprise strategy. You will mentor and coach junior engineers, helping to build strong operational practices and foster a culture of continuous improvement. You will lead by example in solving complex reliability challenges, advancing SRE standards, and driving rapid iteration from concept to production. Above all, you will bring curiosity, creativity, and technical depth to help shape the next generation of reliable AI at Schwab.

Requirements

  • 8+ years of software development or reliability engineering experience, with 4+ years as a hands-on senior engineer in startups and/or large organizations.
  • Bachelor's degree in Computer Science or a related field.
  • 5+ years of experience building and operating complex products from scratch and running them in production.
  • 3+ years of experience supporting applications that use Artificial Intelligence (AI) models to deliver real business impact.
  • 3+ years of experience building and maintaining data pipelines and infrastructure for large datasets.
  • 3+ years of experience with containers and cloud-native applications, and the ability to operationalize them in the public cloud with infrastructure as code.
  • Experience implementing monitoring, alerting, and incident response for large-scale distributed systems.
  • Proven track record in driving reliability, scalability, and performance improvements for production AI systems.

Nice To Haves

  • Strong computer science fundamentals and experience working across different parts of the tech stack.
  • Experience working with proprietary or open-source LLMs (Gemini, Claude, OpenAI or other models) and supporting LLM-powered applications in production.
  • Focus on quality and reliability in everything you do. Continue to raise the bar and drive others to deliver high-quality, resilient products, with experience writing tests and implementing automated reliability checks.
  • Experience writing and running evaluations to ensure quality and monitor consistency in LLM-generated responses and actions.
  • Strong communication skills - you balance written and verbal communication to clearly share your perspective with others on the team.
  • Experience mentoring junior engineers and helping them grow their technical and operational skills through clear feedback and code reviews.
  • Demonstrated mindset of continuous learning and improvement.
  • Ability to solve complex problems with ambiguous or incomplete data in highly distributed systems.
  • Demonstrated business domain knowledge related to all products you have worked on.
  • Curiosity about new technologies and processes - you always seek to improve yourself and everyone around you and proactively seek and share knowledge with others on your team.
  • Experience with Python and front-end development preferred but not required.
  • Master's or other advanced degree in Computer Science or a related field.

Benefits

  • In addition to the salary range, this role is also eligible for bonus or incentive opportunities.