Vista Equity Partners - Coral Gables, FL

posted about 1 month ago

Full-time
Coral Gables, FL
Funds, Trusts, and Other Financial Vehicles

About the position

The Lead Site Reliability Engineer at Allvue Systems is responsible for developing and implementing strategies to monitor and maintain the health, performance, and security of systems. This role involves collaborating with development and operations teams to ensure effective deployment and maintenance of cloud-based infrastructure, while also championing continuous improvement and high-quality engineering practices within the team.

Responsibilities

  • Develop and implement strategies for the monitoring and alerting of systems health, performance, and security
  • Develop and implement strategies for incident management, problem management, and change management
  • Create and maintain automation tools and code for configuration management, deployment, and maintenance of cloud-based infrastructure
  • Collaborate with development and operations teams to ensure that application and infrastructure changes are properly tested, deployed, and maintained
  • Develop and maintain documentation of system configurations, processes, and procedures
  • Champion an atmosphere of continuous improvement by serving as a coach, mentor, and technical advisor
  • Be a thoughtful technical voice within the team, aiding in diligent architectural decisions and fostering a culture of high-quality code and engineering processes
  • Collaborate with Product and Engineering teams to ensure successful delivery and operation of diverse systems at scale
  • Identify opportunities to improve the current technology stack and individual systems, and prioritize the remediation of technical debt

Requirements

  • Strong understanding of SRE best practices
  • Solid understanding of DevOps methodologies and practices, including CI/CD pipelines, configuration management, and Infrastructure as Code (IaC)
  • Proficiency in scripting or programming languages (PowerShell, Python, or similar) for automation and infrastructure management in AWS and Azure, as well as IaC tools such as Terraform and CloudFormation
  • Deep understanding of networking, security, and identity and access management (IAM) in cloud environments
  • In-depth knowledge of cloud computing concepts, including expertise in designing cloud-based solutions using IaaS, PaaS, and SaaS models
  • Experience with monitoring, observability, and logging tools (Datadog, Splunk, Prometheus, Grafana, etc.)
  • Familiarity with cloud architecture patterns, microservices, containers, and serverless computing
  • Proficient in performing in-depth analysis, complex technical troubleshooting, and problem resolution
  • Strong time management skills, with the ability to multitask and perform well under pressure
  • Experience working within geographically distributed organizations
  • Professional written and interpersonal skills

Nice-to-haves

  • AWS or Azure certifications (AWS/Azure Solutions Architect, Developer, etc.)