TikTok - Mountain View, CA

posted about 1 month ago

Full-time - Mid Level
Hybrid - Mountain View, CA
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

The Site Reliability Engineer (SRE) position at TikTok's U.S. Data Security (USDS) division focuses on ensuring the reliability of TikTok's video system, which is crucial for providing a seamless experience to billions of users. The role involves managing production systems, monitoring incidents, and optimizing infrastructure to handle high traffic volumes, especially during significant events. The SRE will work collaboratively within a hybrid work environment, contributing to the overall mission of TikTok to inspire creativity and bring joy.

Responsibilities

  • Own the overall reliability of TikTok's video system, including video publishing and distribution.
  • Perform lifecycle management of production systems including change management, service deployment, operations and emergency response.
  • Monitor the system and respond to incidents to maintain the system's service level agreement (SLA); review and follow up on all production incidents.
  • Perform capacity management of compute, storage and network bandwidth resources to ensure system stability and save infrastructure costs.
  • Provide strong support during big events to ensure the system can handle large volumes of Internet traffic.
  • Build tools, automations, visualizations and monitors to facilitate the operation and optimization of the global infrastructure.

Requirements

  • Bachelor's degree in Computer Science or a related technical background involving software/system engineering, or equivalent working experience.
  • 2+ years of SRE or DevOps experience in large-scale online services.
  • Programming experience with at least one of the following languages: C, C++, Java, Python, C# or Go.

Nice-to-haves

  • Extensive knowledge of networking, operating systems, database systems and container technology.
  • Good understanding of every aspect of microservice architecture, and hands-on experience troubleshooting large-scale distributed systems.
  • Hands-on experience with common open-source systems such as Linux, MySQL, MongoDB, Redis and ELK.
  • Experience building solutions with AWS, Google, Azure and other cloud services.
  • Passionate and self-motivated, with good teamwork skills.

Benefits

  • Medical, dental, and vision insurance from day one.
  • 401(k) savings plan with company match.
  • Paid parental leave.
  • Short-term and long-term disability coverage.
  • Life insurance.
  • Wellbeing benefits.
  • 10 paid holidays per year.
  • 10 paid sick days per year.
  • 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure).