Software Operations Engineer

SambaSafety
Hybrid

About The Position

SambaSafety is seeking a dedicated and skilled Operations Engineer to join our dynamic Software Operations team. In this role, you will be instrumental in maintaining and optimizing the performance of our Software-as-a-Service (SaaS) production and demo environments. This is an exciting opportunity for an engineer who thrives in a fast-paced, innovative environment and who is passionate about ensuring the reliability and scalability of high-availability systems.

The Software Operations (SOOP) team serves as the operational backbone connecting many teams, including Engineering, Product, SRE, DataOps, Compliance, Gov Relations, and Customer Experience. This role requires strong production support skills, excellent operational judgment, and comfort working within structured incident, problem, and service request processes (currently run in Jira Service Management) and with modern observability platforms such as Dynatrace. This is not a customer support position; it is a technical operations role focused on production reliability, incident management, and platform stability.

Requirements

  • Proven experience as an Operations Engineer or in a similar role, preferably within a SaaS environment.
  • Demonstrated ability to balance urgency, risk, and stakeholder communication during production issues.
  • Experience supporting SaaS production environments.
  • Experience with Jira Service Management (or similar ticketing platforms) for incident, problem, and service request tracking.
  • Experience with Confluence (or similar knowledge-base tools) for documentation, runbooks, and operational knowledge sharing.
  • Experience with cloud platforms such as AWS, Azure, or Google Cloud.
  • Working knowledge of Linux/Unix administration.
  • Working knowledge of databases and data platforms (SQL, PostgreSQL, Snowflake) for operational support and troubleshooting.
  • Familiarity with Dynatrace or similar observability/monitoring platforms for production troubleshooting and performance analysis.
  • Familiarity with configuration management tools such as Ansible, Puppet, or Chef.
  • Familiarity with ITIL concepts (Incident, Problem, Change, Service Request), with a pragmatic, operations‑focused mindset.
  • Knowledge of containerization technologies like Docker and orchestration tools like Kubernetes.
  • Knowledge of scripting languages such as Python, Bash, or Perl.
  • Understanding of networking concepts and troubleshooting.
  • Strong analytical and problem-solving skills with a proactive approach to addressing issues.
  • Excellent verbal and written communication skills, including the ability to clearly document incidents, investigations, and operational procedures for both technical and non‑technical audiences.
  • Ability to thrive in a fast-paced environment and manage multiple priorities effectively.

Responsibilities

  • Oversee the deployment, monitoring, maintenance, and support of all SaaS production and demo systems and infrastructure.
  • Triage, manage, and resolve production incidents and recurring operational problems using ITIL‑aligned practices, ensuring minimal downtime and customer impact. Own incident coordination, technical investigation, escalation, and post‑incident documentation. Ensure accurate ticket status, ownership, and timelines within Jira Service Management.
  • Use Jira Service Management as the system of record for incidents, problems, and service requests. Maintain and contribute to operational documentation and runbooks in Confluence. Monitor production health and investigate anomalies using Dynatrace and related observability tools.
  • Analyze system performance and implement tuning improvements to ensure optimal system efficiency and reliability.
  • Develop and maintain automation scripts and tools to improve operational workflows and reduce manual intervention.
  • Partner closely with other teams to coordinate incident response, deployments, escalations, and operational handoffs.
  • Implement and enforce security best practices to protect data and maintain compliance with relevant regulations.
  • Maintain comprehensive documentation of production environments, operational procedures, and system configurations.
  • Identify and recommend process improvements to enhance operational efficiency and system reliability.
  • Participate in an on‑call rotation to support production systems. Ensure production issues are addressed with urgency while maintaining high-quality communication and documentation standards.

Benefits

  • Flexible and generous Paid Time Off and Paid Volunteer Days
  • 401k Employer Match
  • Generous Healthcare Benefits
  • Up to 12 weeks of paid maternity leave, based on tenure
  • Wellness & Tuition Reimbursement
  • Flexible Work Arrangements
  • Lots of SambaSafety swag & SambaSafety Events

What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Education Level

No Education Listed

Number of Employees

101-250 employees
