Site Reliability Engineer II

PROS, Houston, TX

About The Position

About PROS: PROS, Inc. is the leading offer management provider to the airline industry, helping airlines deliver seamless retail experiences designed to maximize revenue and margin growth. Powered by AI, the PROS Platform enables commercial teams to align capacity with demand and coordinate pricing, merchandising and offer strategies to construct and market optimal offers in real time. By optimizing every customer interaction, PROS helps airlines improve revenue performance and quality, increase commercial agility, attract more customers and build lasting loyalty. Learn more at pros.com.

Day in the Life of the Site Reliability Engineer II: The Site Reliability Engineer II optimizes service performance, actively participates in reliability improvements, and conducts in-depth SLO and capacity analysis. This position exists to enhance system reliability and scalability while contributing to automation and self-service tool development.

Requirements

  • 5+ years of experience in enterprise networking, including hands‑on work with routing, switching, firewalls, load balancers, and VPN technologies.
  • Strong understanding of cloud networking architectures, including VPC/VNet design, peering, private link, and hybrid connectivity models.
  • Experience with network security technologies, such as security groups, NACLs, firewall policies, WAF, IDS/IPS, and micro‑segmentation.
  • Proficiency in Layer 2 and Layer 3 network protocols, including BGP, OSPF, EIGRP, DNS, DHCP, NAT, and IP addressing/subnetting.
  • Hands‑on experience with load balancers and ingress technologies, including F5, NGINX, Azure Application Gateway, ALB/NLB, or equivalent.
  • Strong troubleshooting skills using packet analyzer tools, flow logs, and network monitoring platforms.
  • Skilled in analyzing performance trends and identifying optimization opportunities.
  • Collaborates with teams to improve monitoring coverage.
  • Ability to participate in structured reliability testing and analysis.
  • Able to evaluate system components for resilience.
  • Contributes to reliability-focused design discussions.
  • Skilled in analyzing trends to inform service improvements.
  • Collaborates with teams to align SLOs with user expectations.
  • Develops moderately complex automation tools.
  • Skilled in building internal self-service capabilities.
  • Evaluates automation opportunities for operational efficiency.
  • Skilled in analyzing capacity data to inform scaling decisions.
  • Able to recommend improvements for resource utilization.
  • Ensures scalability is considered in feature development.

Nice To Haves

  • Bachelor’s Degree in Computer Science, Information Technology, or a related field
  • Practical experience with Fortigate firewalls and F5 appliances is highly desirable
  • AI Fluency & Growth Mindset: We welcome candidates who:
      • Understand core AI concepts and apply them ethically to enhance productivity, insights, and decision-making.
      • Craft effective prompts to optimize the quality and relevance of AI-generated outputs.
      • Explore and apply agentic AI systems, using or managing autonomous agents to streamline workflows and automate tasks.
      • Leverage AI tools to boost efficiency, creativity, and innovation in their daily work.
      • Stay curious and adaptable, continuously experimenting with AI-driven solutions to elevate team performance and customer impact.

Responsibilities

  • Performance Monitoring: Monitor service performance, assist in troubleshooting production issues, and learn system architecture.
  • Reliability Participation: Monitor service reliability, participate in resolving basic issues, and learn disaster recovery testing procedures.
  • SLO Implementation: Understand SLO concepts, monitor and analyze SLO patterns, and assist in implementing SLO visualization and alerting.
  • Capacity Analysis: Perform basic capacity analysis, identify trends in system capacity, and participate in capacity planning.
  • Automation Deployment: Deploy and maintain existing automation tools, create simple scripts, and troubleshoot automation scripts.
  • Follow predefined procedures to deploy PROS products and third-party applications to the Cloud environments.
  • Contribute to the release management documentation.
  • Gain understanding of application architecture and interaction between system components.

Benefits

  • Flexible ways of working
  • Continuous learning