Senior Site Reliability Engineer

Caterpillar Inc., Peoria, IL
About The Position

As a Site Reliability Engineer, you will be responsible for ensuring the reliability, availability, and performance of our D365 ERP systems, connectivity, and infrastructure. You will collaborate with cross-functional teams to develop and implement strategies to improve system stability, automate repetitive tasks, and enhance service delivery and performance. If you have a passion for delivering reliable, high-performance services and thrive in a fast-paced environment, we'd love to hear from you. Apply now to join our team as a Site Reliability Engineer!

Requirements

  • Effective Communications: Strong understanding of communication concepts, tools and techniques; ability to effectively transmit, receive, and accurately interpret ideas, information, and needs through the application of appropriate communication behaviors.
  • Technical Troubleshooting: Extensive knowledge of technical troubleshooting approaches, tools and techniques; ability to anticipate, recognize, and resolve technical issues on hardware, software, application or operation.
  • Performance Measurement and Tuning: Knowledge of system performance, testing and programming; ability to monitor, measure, and optimize system performance and network communication.
  • Software Release Management: Knowledge of strategies, practices and tools for managing versions and distribution of software products and enhancements; ability to evaluate and improve release management practices and tools.
  • Software Reliability Management: Knowledge of software reliability management; ability to develop and use principles, methodologies and metrics that increase software product performance and reliability.
  • Bachelor's degree in Computer Science, Information Technology, a related field, or equivalent experience.
  • 6+ years of experience in site reliability engineering, DevOps, QA, or a related field.
  • Strong experience with Microsoft D365 or general Azure-based services.
  • Experience with AWS infrastructure and services.
  • Experience with IaC solutions such as CloudFormation and Terraform.
  • Experience with CI/CD solutions such as GitHub and Azure DevOps.
  • Strong troubleshooting and critical thinking skills.
  • 6+ years of experience and proficiency in one or more programming languages, such as Python or JavaScript (both preferred).
  • Solid understanding of networking, load balancing, on-premises hosting solutions, and web application architectures.
  • Experience with containerization technologies, such as Docker and Kubernetes.
  • Excellent problem-solving skills and a strong attention to detail.
  • Strong IT and business communication skills and the ability to collaborate effectively with cross-functional teams.

Responsibilities

  • Monitor and troubleshoot production and QA systems to identify and resolve performance, scalability, and reliability issues proactively.
  • Participate in the on-call rotation to provide 24/7 critical incident support for eCommerce platform systems.
  • Design, implement, and maintain automated processes and tools to streamline deployment and release processes.
  • Collaborate with cross-functional teams to define, document, and implement operational processes, best practices, and procedures.
  • Implement and maintain system monitoring tools and dashboards to provide real-time insights into system performance and identify potential issues.
  • Work closely with developers to identify and fix bugs and performance bottlenecks in the application code.
  • Ensure that systems and infrastructure comply with security, compliance, and regulatory requirements.
  • Continuously evaluate systems and processes to identify areas for improvement and implement changes as needed.

Benefits

  • Medical, dental, and vision benefits
  • Paid time off plan (Vacation, Holidays, Volunteer, etc.)
  • 401(k) savings plans
  • Health Savings Account (HSA)
  • Flexible Spending Accounts (FSAs)
  • Health Lifestyle Programs
  • Employee Assistance Program
  • Voluntary Benefits and Employee Discounts
  • Career Development
  • Incentive bonus
  • Disability benefits
  • Life Insurance
  • Parental leave
  • Adoption benefits
  • Tuition Reimbursement