DAT Freight & Analytics - posted 4 months ago
Full-time • Mid Level
Hybrid • Seattle, WA
501-1,000 employees
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

DAT is looking for an experienced Site Reliability Engineer to join our SRE platform team and help grow our SRE practices. This position is hybrid or remote, based in Seattle, WA.

In this role, you will contribute to technical initiatives while building your own skills. You'll work closely with development teams and platform architects to achieve critical reliability goals and help scale our platform. You will be responsible for ensuring the stability, performance, and scalability of our systems: implementing robust monitoring, automating operational tasks, and proactively identifying and resolving potential issues. The role requires a deep understanding of distributed systems and cloud infrastructure, and a commitment to best practices in site reliability engineering.

Responsibilities:
  • Contribute to the design, implementation, and maintenance of scalable and reliable systems.
  • Collaborate with engineering teams to ensure reliability targets are met.
  • Identify and troubleshoot complex issues across distributed systems, ensuring minimal downtime and optimal performance.
  • Advocate for and implement SRE best practices, including automation, monitoring, and incident response, to enhance system resilience.
  • Participate in capacity planning and performance tuning to proactively address potential bottlenecks and support future growth.
  • Leverage new AI tools to assist with coding and observability tasks.
  • Assist and respond to critical engineering incidents.
  • Improve your engineering skills within the SRE team.
  • Provide technical guidance and best practices for use of cloud infrastructure and tooling.
  • Contribute to Infrastructure-as-Code within the platform.
  • Contribute to reliability-focused initiatives and projects.
  • Help optimize our work to be customer-focused.
  • Assist in migrating legacy systems to modern, scalable cloud environments.
  • Help develop and drive a culture of continuous improvement with the Platform Engineering and Software Engineering groups.
  • Participate in an on-call rotation.

Qualifications:
  • Strong collaboration and problem-solving abilities, especially within SRE or Platform Engineering/Infrastructure teams.
  • 2 to 4+ years of total industry experience.
  • At least 1 year of software engineering experience (JavaScript, Python, Go, Java/Kotlin, C++, etc.).
  • Experience with modern observability tools (Datadog preferred).
  • Experience with cloud platforms (preferably AWS).
  • Demonstrated success in contributing to large technical initiatives.
  • Proven experience assisting in modernizing legacy code and infrastructure.
  • Ability to work closely with peer teams, platform/software architects and management.
  • Willingness to share your expertise among team members.
  • Understanding of cloud infrastructure, automation, and best practices for reliability.
  • Experience with our tools (Kubernetes, ArgoCD, Terraform, GitHub Actions) is a plus.

Benefits:
  • Medical, Dental, Vision, Life, and AD&D insurance
  • Parental Leave
  • Up to 20 days of paid time off starting in year one
  • An additional 10 paid holidays per calendar year
  • 401k matching (immediately vested)
  • Employee Stock Purchase Plan
  • Short- and long-term disability and sick leave
  • Flexible Spending Accounts
  • Health Savings Accounts
  • Tuition Reimbursement Program
  • Employee Assistance Program
  • Additional programs - Employee Referral, Internal Recognition, and Wellness
  • Free TriMet transit pass (Beaverton Office)
  • Competitive salary and benefits package
  • Work on impactful projects in a cutting-edge environment
  • Collaborative and supportive team culture
  • Opportunity to make a real difference in the trucking industry
  • Employee Resource Groups