Jobgether • Posted 8 days ago
$110,000 - $270,000/Yr
Full-time • Mid Level
Remote
11-50 employees

This role offers the opportunity to shape and maintain the reliability and scalability of a large-scale, global enterprise platform. The Senior SRE will work closely with engineering teams to design cloud infrastructure, optimize production systems, and automate operational processes. The position requires hands-on expertise across backend systems, Java development, and open-source technologies, along with a strong operational mindset to handle incident management and root cause analysis. You will influence system design, mentor peers, and contribute to cross-functional initiatives, ensuring a highly available, performant platform used by millions worldwide. This is a high-impact role within a collaborative, fast-paced environment that values problem-solving, innovation, and technical excellence.

  • Build, maintain, and optimize cloud infrastructure to support enterprise-scale applications.
  • Ensure platform reliability and scalability for hundreds of global customers across multiple regions.
  • Lead incident triage and mitigation efforts, including on-call rotations for critical escalations.
  • Develop automation tools to reduce manual operational tasks and accelerate issue resolution.
  • Provide full-stack diagnostics, identify root causes, and implement preventive measures for production issues.
  • Participate in engineering design reviews and guide development decisions for large-scale systems.
  • Collaborate effectively with Product Management, Design, QA, and other engineering teams to deliver reliable, customer-facing solutions.
  • Focus on backend system architecture while contributing to frontend and infrastructure initiatives as needed.
  • Mentor team members and foster a positive, high-performing engineering culture.
  • Communicate clearly with technical and non-technical stakeholders during incidents and project discussions.

  • 5+ years of experience in Java development, ideally in enterprise cloud or high-growth technology companies.
  • Hands-on operational experience managing high-volume production systems, including incident management and root cause analysis.
  • Strong coding skills, capable of writing clean, testable, maintainable code in collaborative settings.
  • Proficiency with open-source technologies such as Spring, MySQL, Hibernate, Solr, Maven, Git, Tomcat, Linux, AWS, Vagrant, Docker, and Kubernetes.
  • 3+ years of experience with relational databases and expert-level SQL skills.
  • Scripting and automation experience with Shell, Bash, Ansible, Python, Go, Ruby, or similar languages and tools.
  • Demonstrated leadership in incident management and ability to mentor and guide other engineers.
  • Excellent communication skills for cross-functional collaboration and incident reporting.
  • Availability to work Monday–Friday, 6 AM–2 PM EST; candidates should be located in the EST or AST time zones.

  • Competitive base salary: $110,000–$270,000, with potential for variable bonus or stock options.
  • Comprehensive health coverage: medical, dental, vision, and basic life insurance.
  • Flexible PTO and company-paid holidays.
  • Retirement savings programs.
  • Charitable giving program supporting personal philanthropy.
  • Remote work flexibility with collaboration across global teams.