Senior Site Reliability Engineer - Data Infrastructure

ByteDance • Seattle, WA

About The Position

Our Site Reliability Engineering (SRE) team combines software and systems engineering to build and operate large-scale distributed systems with high reliability and efficiency. In this role, you'll apply your expertise in coding, algorithms, complexity analysis, and system design to solve scaling and reliability challenges. We're looking for a Senior SRE who can provide deep technical leadership, drive architectural improvements, and collaborate effectively across multiple organizations. You'll partner with engineering, product, data, and infrastructure teams to deliver resilient, scalable platforms. This is a highly technical, hands-on role that requires strong problem-solving skills, clear communication, and the ability to influence without formal authority.

Responsibilities

Apply strong hands-on skills to the design, development, and operation of large-scale cloud infrastructure and distributed systems.
Collaborate with cross-functional teams (e.g., Advertising, Machine Learning, E-commerce, and Core Infra) to drive system reliability, performance, and scalability.
Lead initiatives to automate operations, eliminate toil, and improve overall system efficiency.
Troubleshoot complex production issues, perform root-cause analysis, and drive long-term reliability improvements.
Promote best practices in system design, observability, performance optimization, and cost efficiency.
Communicate complex technical concepts effectively to both technical and non-technical stakeholders.

What This Job Offers

Career Level

Senior

Industry

Publishing Industries

Number of Employees

5,001-10,000 employees
