Senior Site Reliability Engineer

SimCorp, Toronto, ON
Hybrid

About The Position

As a Senior Site Reliability Engineer, you will work on Cloud Native Products & Services, taking ownership of responsibility domains such as monitoring, observability, release management, vulnerability management, cost management, and audit & compliance. You will work closely with DevOps engineers, clients, and stakeholders to ensure reliability, performance, and automation for both existing and new cloud native products & services. You will also onboard clients onto these products and services and support long-running clients on them. Your contributions will drive stability, continuous improvement, and operational excellence in our Azure-based environments. This role blends hands-on engineering, incident response, platform configuration, and service quality, guided by ITIL and SRE best practices.

Requirements

  • Bachelor’s degree in Computer Science or related field (Master’s is a plus)
  • 3+ years in Site Reliability, DevOps, or Cloud Engineering roles
  • Must have expertise with Microsoft Azure Cloud.
  • Expertise in Infrastructure as Code (IaC) using Bicep, ARM and Terraform
  • Solid experience in monitoring and logging tools (Azure Monitor, Application Insights, Datadog, Log Analytics).
  • Hands-on experience onboarding, integrating, and configuring IdP solutions like Azure Entra ID, Okta, Keycloak or PingFederate
  • Experience in centralizing authentication, managing user identities, and implementing secure access protocols (SAML, OAuth, OIDC)
  • Experience working with observability frameworks like OpenTelemetry and distributed tracing systems
  • Experience working with application reliability platforms like Checkly or equivalent
  • Experience setting up synthetic monitoring using Playwright or equivalent
  • Knowledge of AI/ML-based anomaly detection, log aggregation and analysis tools like Microsoft Azure Anomaly Detector or equivalent.
  • Experience working with Microsoft Defender Suite (EDR, XDR) and Sentinel.
  • Proficient in KQL for threat hunting and improving compliance scores using Defender for Cloud.
  • Able to identify and remediate vulnerabilities
  • Understanding of networking and containerization (Kubernetes, Docker)
  • Good understanding of APIs, scripting and query languages like PowerShell, Bash, and Kusto (KQL), and databases like SQL, Cosmos DB and PostgreSQL
  • Proficiency in IT service management (ITSM) frameworks like ITIL, focusing on incident, change, and problem management to improve operational efficiency
  • Experience managing both onboarding projects and live production operations
  • Collaborative mindset and ability to work in cross-functional teams
  • Interest in continuous learning and growth within your Product Area

Nice To Haves

  • Familiarity with SimCorp Dimension & Salesforce is a plus

Responsibilities

  • Support the operation and enhancement of mission-critical environments for both new and existing Cloud Native products & services
  • Collaborate with product development teams to enhance monitoring, observability, reliability, and performance of these services.
  • Collaborate deeply across engineering teams to understand systems at the code level.
  • Manage & improve our infrastructure deployment pipelines and troubleshoot onboarding and operational issues
  • Drive capacity planning efforts to ensure our platform is resilient and scalable as we grow.
  • Build tools and automation to eliminate manual toil and improve engineering velocity, developer experience, and system reliability.
  • Define and manage SLOs and error budgets in partnership with Engineering teams.
  • Contribute to incident, problem, and change management processes.
  • Execute disaster recovery, configuration management, and platform readiness tasks.
  • Work regular and evening shifts on a rotational basis and provide weekend or on-call support as needed.
  • Collaborate with Agile teams and take part in design discussions with clients, vendors, and stakeholders.
  • Contribute to knowledge sharing across multiple Product Areas.
  • Leverage a strong foundation in ITIL practices, including problem, change, and incident management.

Benefits

  • Flexible working hours and hybrid model - 2 days in the office, 3 days remote
  • Modern office (next to Wilanowska metro station) with quiet zones and ergonomic workstations
  • Base salary with annual bonus structure
  • Holiday allowance when taking a 2-week vacation
  • Occasional remote work across Poland and international options available (up to 24 days domestic, 20 days international per year - subject to internal policy)
  • Employer-paid Medicover Platinum healthcare package (employee-paid family upgrade available)
  • Multisport card with 75% employer contribution
  • Unum group life insurance (employee-paid upgrade available)
  • Medicover travel insurance for private trips and global business travel insurance
  • Possibility to join Deutsche Börse Group Share Plan (eligible after 1 year)
  • Copyrights program for certain roles
  • Possibility to develop your career in an international environment
  • Professional training and courses provided by SimCorp
  • Polish language classes for foreign employees; German and English classes based on business needs
  • Integration events, volunteering initiatives and employee-led clubs