Site Reliability Engineer

DAT Freight & Analytics, posted 4 months ago

Full-time • Mid Level

Hybrid • Seattle, WA

501-1,000 employees

Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

DAT is seeking an experienced Site Reliability Engineer to join our SRE platform team and help grow our SRE practices. This position will work hybrid or remote in Seattle, WA. In this role, you will contribute to technical initiatives while building your own skills, working closely with development teams and platform architects to achieve critical reliability goals and help scale our platform. You will be responsible for ensuring the stability, performance, and scalability of our systems by implementing robust monitoring solutions, automating operational tasks, and proactively identifying and resolving potential issues. The role requires a deep understanding of distributed systems and cloud infrastructure, and a commitment to best practices in site reliability engineering.

Contribute to the design, implementation, and maintenance of scalable and reliable systems.
Collaborate with engineering teams to ensure reliability targets are met.
Identify and troubleshoot complex issues across distributed systems, ensuring minimal downtime and optimal performance.
Advocate for and implement SRE best practices, including automation, monitoring, and incident response, to enhance system resilience.
Participate in capacity planning and performance tuning to proactively address potential bottlenecks and support future growth.
Leverage new AI tools to assist with coding and observability tasks.
Assist with and respond to critical engineering incidents.
Improve your engineering skills within the SRE team.
Provide technical guidance and best practices for use of cloud infrastructure and tooling.
Contribute to Infrastructure-as-Code within the platform.
Contribute to reliability-focused initiatives and projects.
Help optimize our work to be customer-focused.
Assist in migrating legacy systems to modern, scalable cloud environments.
Help develop and drive a culture of continuous improvement with the Platform Engineering and Software Engineering groups.
Participate in an on-call rotation.

Strong collaboration and problem-solving abilities, especially within SRE or Platform Engineering/Infrastructure teams.
A total of 2 to 4+ years of industry experience.
At least 1 year of software engineering experience (JavaScript, Python, Go, Java/Kotlin, C++, etc.).
Experience with modern observability tools (Datadog preferred).
Experience with cloud platforms (preferably AWS).
Demonstrated success in contributing to large technical initiatives.
Proven experience assisting in modernizing legacy code and infrastructure.
Ability to work closely with peer teams, platform/software architects, and management.
Willingness to share your expertise among team members.
Understanding of cloud infrastructure, automation, and best practices for reliability.

Experience with our tools (Kubernetes, ArgoCD, Terraform, GitHub Actions) is a plus.

Medical, Dental, Vision, Life, and AD&D insurance
Parental Leave
Up to 20 days of paid time off starting in year one
An additional 10 holidays of paid time off per calendar year
401(k) matching (immediately vested)
Employee Stock Purchase Plan
Short- and Long-term disability sick leave
Flexible Spending Accounts
Health Savings Accounts
Tuition Reimbursement Program
Employee Assistance Program
Additional programs - Employee Referral, Internal Recognition, and Wellness
Free TriMet transit pass (Beaverton Office)
Competitive salary and benefits package
Work on impactful projects in a cutting-edge environment
Collaborative and supportive team culture
Opportunity to make a real difference in the trucking industry
Employee Resource Groups
