Service Reliability Engineer

Proofpoint • Tallahassee, FL

About The Position

As a Service Reliability Engineer at Proofpoint, you will play a crucial role in ensuring the reliability and performance of our next-generation security products. This position involves managing and scaling production services across multiple data centers and AWS, contributing to architectural improvements, and collaborating with cross-functional teams to enhance service reliability and capacity.

Requirements

3-5 years' experience managing, troubleshooting, and tuning Linux systems.
Experience with industry-standard foundation technologies such as TCP/IP, HTTP, DNS, SMTP, and LDAP.
Experience in management of a large distributed computing environment.
Experience with virtualization technologies such as KVM, VMware vSphere, and/or OpenStack.
Excellent verbal and written communication skills.
Experience with monitoring and alerting systems.
Experience with industry-standard operational practices such as change management and incident management.
Experience with configuration management tools such as Puppet or Chef.
Experience automating management of systems and applications using Perl, Python, or Ruby.
Experience with load-balancing technologies such as F5, NetScaler, or similar.
Experience with Kafka, Elasticsearch, Cassandra, and MySQL.
Experience with public cloud providers such as Amazon EC2 or Microsoft Azure.
BS or equivalent experience in Computer Science, Engineering, or a related technical discipline.

Nice To Haves

Experience with automation tools and frameworks.
Familiarity with security best practices in cloud environments.

Responsibilities

Build long-lasting, effective partnerships across the organization to foster collaboration between Product, Engineering, and Operations teams.
Manage an international 24x7, multi-site production infrastructure powering Proofpoint services, including deployment, maintenance, troubleshooting, performance tuning, and security.
Root-cause complex scaling and performance problems that involve multiple stakeholders and span network, hardware, and software.
Ensure proper monitoring, alerting, capacity planning and reporting in the production environment.
Contribute to the evolving design and architecture of reliable and scalable infrastructure.
Act as the first line of defense during working hours for any alerts or incidents that arise.
Collaborate with product engineering teams to ensure Operations standards are observed, determine resource impacts for upcoming product deployments, and ensure successful product rollouts.
Participate in an on-call rotation and be willing to jump on escalated issues as needed.

Benefits

Flexible time off
Robust well-being program with 4 global well-being days per year
3-week work-from-anywhere option

What This Job Offers

Job Type: Full-time
Career Level: Mid Level
Industry: Administrative and Support Services
Education Level: Bachelor's degree
