As a Lead DevOps Engineer, you will lead the design, implementation, and optimization of our DevOps practices while enabling our teams to deliver reliable, scalable solutions. This role is crucial to building, automating, and maintaining our cloud infrastructure and CI/CD pipelines, and to ensuring the reliability and scalability of our applications. You'll play a key part in fostering a culture of operational excellence, security, and continuous delivery, working closely with development and product teams.

Infrastructure as Code (IaC): Design, implement, and manage scalable, secure, and highly available cloud infrastructure as code (e.g., using Terraform, CloudFormation), primarily on AWS, with an understanding of best practices for GCP environments.
Automation & CI/CD: Develop and maintain robust CI/CD pipelines (e.g., GitLab CI/CD, GitHub Actions, Jenkins, AWS CodePipeline) to automate software delivery, testing, and deployment processes.
Linux System Administration: Provide expert-level administration, troubleshooting, and optimization for Linux-based systems, ensuring stability, security, and performance.
Monitoring & Observability: Implement comprehensive monitoring, logging, and alerting solutions (e.g., Prometheus, Grafana, ELK Stack, CloudWatch, DataDog) to ensure application health, performance, and proactive issue detection.
Networking & Security: Configure and manage cloud networking components (VPCs, subnets, routing, security groups, firewalls) and implement security best practices (IAM, encryption, least privilege).
Troubleshooting & Incident Response: Act as a subject matter expert for production issues, performing root cause analysis, implementing preventative measures, and participating in on-call rotations as required.
Collaboration: Work closely with software engineers, data scientists, and product managers to understand their needs and provide reliable, efficient, and secure infrastructure solutions.
Continuous Improvement: Identify and implement improvements to existing systems, tools, and processes to enhance efficiency, reduce costs, and improve reliability.
Documentation: Create and maintain clear, concise documentation for infrastructure, processes, and playbooks.
Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed