Software Engineer – Service Reliability & Observability (.NET)

Blueprint Technologies
Redmond, WA
$50 - $55
Remote

About The Position

In this role, you will be a hands-on engineer responsible for maintaining and improving a customer-facing data service in a production environment. You will focus on ensuring system reliability, security, and performance by addressing live-site issues, implementing fixes, and enhancing observability through logging, metrics, and tracing. You will work closely with cross-functional teams including engineering, support, and security to resolve incidents, perform root-cause analysis, and continuously improve service stability. This role also involves automating operational processes and contributing to long-term service resilience, making it ideal for someone who enjoys working on real-time systems and driving operational excellence.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience
  • Strong experience developing and maintaining production services using C#/.NET
  • Experience troubleshooting and resolving live-site issues in customer-facing applications
  • Hands-on experience with observability tools (logging, metrics, distributed tracing, alerting)
  • Familiarity with secure coding practices and handling security vulnerabilities in production systems
  • Experience automating operational workflows using scripts or tooling
  • Ability to collaborate effectively across engineering, support, and security teams during incident response
  • Strong analytical and problem-solving skills with a focus on root-cause analysis
  • Clear communication skills for documenting incidents, writing postmortems, and providing updates

Nice To Haves

  • Experience supporting large-scale or enterprise data services/APIs with high reliability requirements
  • Familiarity with modern frontend technologies (e.g., React) for debugging or minor enhancements
  • Experience with cloud platforms (e.g., Azure) and service monitoring tools
  • Proven experience improving observability and reducing operational toil through automation
  • Understanding of compliance, privacy, and data governance in regulated environments
  • Experience with incident management processes, post-incident reviews, and operational excellence practices
  • Ability to create concise, leadership-ready summaries of incidents, risks, and improvements

Responsibilities

  • Own day-to-day health of a production service, including bug fixes, reliability improvements, and security remediation
  • Investigate and resolve customer-impacting incidents through deep root-cause analysis
  • Implement security fixes and ensure adherence to enterprise security standards
  • Improve service observability by enhancing logging, metrics, tracing, and alerting
  • Build and maintain telemetry pipelines and dashboards for monitoring service health and performance
  • Automate recurring operational and support tasks to reduce manual effort and increase efficiency
  • Contribute to service hardening efforts, including resiliency improvements and failure-mode analysis
  • Maintain and enhance backend services (primarily C#/.NET), with occasional support for frontend components
  • Create and maintain technical documentation for service workflows, behaviors, and known issues
  • Communicate service health and engineering metrics to stakeholders, including incident trends and improvements

Benefits

  • Medical, dental, and vision coverage
  • Flexible Spending Account
  • 401(k) program
  • Competitive PTO offerings
  • Parental Leave
  • Opportunities for professional growth and development