Senior Site Reliability Engineer

AvidXchange, Inc.
Charlotte, NC
Hybrid

About The Position

The Senior Site Reliability Engineer is responsible for meaningfully contributing to, and providing continuous feedback on, the site health, reliability, availability, and user experience of Avid products. This is a matrixed role in which the SRE works closely with the product team on a day-to-day basis while reporting to the practice lead. The SRE is expected to understand the product in depth, collect and analyze meaningful measurements, and provide feedback to the business, Software Engineering, and Product teams. The SRE will work very closely with key stakeholders to help drive changes that increase customer satisfaction, product availability, and reliability, and that advance the completion of strategic technical initiatives. In addition to monitoring and integration with the observability platform, a heavy focus will be placed on identifying automation opportunities and automating operational processes to maintain high availability of the product.

Requirements

  • General knowledge of most technical expertise areas, with deep knowledge in at least two areas: advanced Terraform syntax, Ansible (syntax, tasks, playbooks), and CI/CD configuration, pipelines, and jobs.
  • Advanced knowledge of cloud services (preferably Azure)
  • Monitoring with Dynatrace, Azure Application Insights, Prometheus, and Grafana: service catalog metrics and recording rules for alerts
  • Log shipping pipelines and incident debugging visualizations
  • Ability to understand and contribute improvements to the codebase to resolve issues.
  • Proven ability to develop internal applications/tools and contribute code to infrastructure or production systems.
  • 7+ years of Software Engineering or Site Reliability Engineering experience
  • Bachelor’s degree in Computer Science, Information Technology, or equivalent experience plus certifications
  • Understanding of web hosting infrastructure and architecture in highly available environments
  • Working knowledge and experience with C#, JavaScript, and HTML
  • Experience with one of the Public Cloud architectures (Azure experience highly desired)
  • Familiarity with RESTful APIs and .NET applications.
  • Strong hands-on experience with Azure services (e.g., APIM, Function Apps, Key Vault, App Services) or AWS.
  • Expertise with Kubernetes, Docker, and Helm in production environments.
  • Experience applying Chaos Engineering concepts or tools.
  • Strong scripting and automation capabilities in PowerShell, Python, SQL, and use of workflow tools like Power Automate.
  • Familiarity with CI/CD, DevSecOps practices, Linux system administration, and cloud security principles.

Nice To Haves

  • Microsoft Azure Administrator Associate (AZ-104)
  • Certified Kubernetes Administrator (CKA)
  • HashiCorp Terraform Associate
  • Experience in FinTech or payment systems environments.
  • Experience with containerizing legacy applications for Kubernetes-based deployment.

Responsibilities

  • Performs application-specific SRE support, RCAs, and service restoration as needed to quickly respond to and resolve production issues.
  • Plan and achieve high availability and performance of the product service.
  • Ensure proactive monitoring of all core services and processes to prevent unplanned service disruption.
  • Implement self-healing and scalability of technical services to avoid unplanned disruptions.
  • Identifies significant projects that result in substantial improvements in reliability, cost savings and/or revenue.
  • Identifies changes for the product architecture from the reliability, performance, and availability perspectives with a data-driven approach.
  • Influences the product roadmap and works with engineering and product counterparts to improve the resiliency and reliability of the product.
  • Proactively work on efficiency and capacity planning to set clear requirements and optimize system resource usage.
  • Identify Service Level Indicators (SLIs) that will align the team to meet the availability and latency objectives.
  • Provide detailed analysis and troubleshooting for system outages, providing feedback to product/software engineering.
  • Leads initiatives, including problem definition, scoping, design, and planning, through epics and blueprints.
  • Has deep domain knowledge and radiates that knowledge through recorded demos, technical presentations, discussions, and
  • Perform and run blameless RCAs on incidents and outages, aggressively looking for answers that will prevent the incident from happening again.
  • For stable counterpart assignments, maintain awareness and actively influence stage group plans and priorities through participation in stage group meetings and async discussions. Act as a champion for reliability.
  • Set an example for the team of SREs with positive and inclusive leadership and discussion of work.

Benefits

  • 18 days PTO
  • 11 Holidays (8 company recognized & 3 floating holidays)
  • 16 hours per year of paid Volunteer Time Off (VTO)
  • Competitive Healthcare
  • High Deductible Health Plan Option that has $0 monthly premium for teammate-only coverage
  • 100% AvidXchange paid Dental Base Plan Coverage
  • 100% AvidXchange paid Life Insurance
  • 100% AvidXchange paid Long-Term Disability
  • 100% AvidXchange paid Short-Term Disability
  • Employee Assistance Program (EAP) - Provides counseling services, legal and financial consultations and health advocacy for Teammates and their eligible dependents
  • Onsite Health Clinic with Atrium Health - available to Teammates and their eligible dependents
  • 401(k) Match: 100% match on the first 3% of your salary, plus 50% match on the next 2%
  • Parental Leave: 8 weeks 100% paid by AvidXchange
  • Discounts on Pet, Home, and Auto insurance
  • WeeCare Childcare Service: helps teammates find affordable daycare, childcare, and tutors, 40% less expensive than traditional daycare centers
  • Perks at Work: free discount program that provides teammates the opportunity to save on items such as electronics, movie tickets, car buying, vacations, and more
  • Onsite gym and fitness center, yoga studio, and basketball court
  • Tuition Reimbursement up to the federal maximum of $5,250
  • Hybrid Workplace Flexibility
  • Free parking