Site Reliability Engineer

Leidos, Minneapolis, MN

About The Position

Come put your Site Reliability Engineer (SRE) skills into action! Leidos has openings for talented SREs to join our team and develop reusable solutions that support our customers in any environment. You will contribute to the design and implementation of Continuous Integration and Continuous Delivery (CI/CD) pipelines that accelerate the secure delivery of software to production. You will automate the buildout of infrastructure in cloud and on-premises environments to operate Kubernetes clusters and microservices deployments.

In this role, you will join dynamic Agile software teams that are singularly focused on providing world-class solutions to our customers in an exciting, collaborative, and inclusive atmosphere. You will be intellectually challenged and given a tremendous opportunity for growth in a fast-paced and fun environment.

You’ll learn, master, and improve the CI/CD processes and tools we use to develop, test, integrate, and deploy our cloud-based and on-premises solutions into multiple hosting environments, such as AWS, Azure, VMware, and others. You’ll learn new technologies and tools and apply what you’ve learned to overcome technological challenges with innovative solutions. You’ll collaborate with other software engineers and SREs, sharing your knowledge with the team and the organization to make us all better at what we do. You’ll perform technical spikes and develop prototypes to help test product concepts and achieve customer validation.

Requirements

  • Bachelor’s degree in Computer Science, Computer Engineering, or a related field, with 4+ years of relevant experience
  • Demonstrated ability to deliver projects or processes spanning multiple technical domains, including experience in a technical lead capacity
  • Solid understanding of Agile development practices, along with CI/CD methodologies and supporting tools
  • Strong proficiency with Linux and Windows operating systems, as well as networking fundamentals (e.g., HTTP, HTTPS, SSL/TLS, SMTP, DNS)
  • Hands-on experience provisioning and managing resources within cloud and IaaS environments (AWS, Azure, Google Cloud Platform, etc.)
  • Practical experience with infrastructure-as-code and automation tools such as Terraform, Ansible, CloudFormation, Chef, or Puppet
  • Experience working with container technologies (Docker) and orchestration platforms like Kubernetes, including use of kubectl
  • Proficiency with version control systems, such as Git
  • Demonstrated curiosity and initiative in learning new tools, frameworks, and technologies
  • Ability to work independently with minimal supervision while also collaborating effectively within cross-functional engineering teams
  • Travel: 50%, both within the US and overseas

Nice To Haves

  • Experience with enterprise event streaming technologies such as Kafka or NATS
  • Familiarity with monitoring and observability tools like Grafana and Prometheus
  • Exposure to service mesh and API gateway technologies (e.g., Istio)
  • Experience with GitOps tools such as Argo CD, Flux CD, or similar platforms
  • Professional cybersecurity certification (e.g., Security+ or equivalent)
  • Understanding of Agile development methodologies and practices
  • Working knowledge of relational database systems such as Oracle, MySQL, PostgreSQL, or SQL Server

Responsibilities

  • Design, develop, troubleshoot, and maintain mission-critical infrastructure across cloud and on-premises environments using infrastructure-as-code (IaC)
  • Build and support scalable, highly available, and secure cloud-native architectures, including Kubernetes clusters and microservices deployments
  • Enable and optimize CI/CD pipelines by applying best practices for automated provisioning, configuration, testing, and deployment
  • Gather and analyze system and application metrics to support performance tuning, capacity planning, and proactive issue resolution
  • Partner with development teams to improve system reliability through rigorous testing, release processes, and continuous improvement initiatives
  • Participate in system design, platform engineering, and technical decision-making to ensure solutions meet functional, performance, and SLA requirements
  • Collaborate across engineering teams and stakeholders to deliver solutions, resolve technical challenges, and coordinate key deliverables
  • Develop prototypes, perform technical spikes, and evaluate new tools or approaches to solve complex technical problems
  • Continuously assess deployed systems and implement improvements to enhance reliability, scalability, and operational efficiency
  • Mentor team members and contribute to knowledge sharing across the organization

Benefits

  • Employment benefits include competitive compensation, health and wellness programs, income protection, paid leave, and retirement.