PennyMac-posted 2 months ago
$95,000 - $155,000/Yr
Full-time • Mid Level
Westlake Village, CA
Credit Intermediation and Related Activities

The Sr DevOps Engineer - AI platform will design, implement, and manage scalable and resilient infrastructure on AWS. This role involves architecting and maintaining Windows/Linux-based environments and ensuring seamless integration with cloud platforms. The engineer will develop and maintain infrastructure-as-code (IaC) using both AWS CloudFormation/CDK and Terraform/OpenTofu, as well as configuration management for Windows and Linux servers using Chef. Additionally, the role includes designing, building, and optimizing CI/CD pipelines using GitLab CI/CD for .NET applications, integrating and supporting AI services, and enabling AI/ML workflows. The engineer will also implement observability, security, data privacy, and cost-optimization strategies for AI workloads, collaborate with development teams, troubleshoot issues, and contribute to the development of DevOps standards and best practices.

  • Design, implement, and manage scalable and resilient infrastructure on AWS.
  • Architect and maintain Windows/Linux-based environments.
  • Develop and maintain infrastructure-as-code (IaC) using AWS CloudFormation/CDK and Terraform/OpenTofu.
  • Develop and maintain configuration management for Windows and Linux servers using Chef.
  • Design, build, and optimize CI/CD pipelines using GitLab CI/CD for .NET applications.
  • Integrate and support AI services, including orchestration with AWS Bedrock and Google Agentspace.
  • Enable AI/ML workflows by building and optimizing infrastructure pipelines.
  • Automate model lifecycle management through CI/CD pipelines.
  • Collaborate with AI engineering teams to deliver scalable environments and infrastructure.
  • Implement observability, security, data privacy, and cost-optimization strategies for AI workloads.
  • Implement and enforce security best practices across the infrastructure.
  • Collaborate closely with development teams to understand their needs.
  • Troubleshoot and resolve infrastructure and application deployment issues.
  • Implement and manage monitoring and logging solutions.
  • Contribute to the development and documentation of DevOps standards and best practices.
  • Stay up to date with the latest industry trends and technologies.
  • Provide mentorship and guidance to junior team members.
  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 5+ years of experience in a DevOps or Site Reliability Engineering (SRE) role.
  • 1+ years of experience with AI services and LLMs.
  • Extensive hands-on experience with Amazon Web Services (AWS).
  • Solid understanding of Windows/Linux server administration and integration with cloud environments.
  • Proven experience with infrastructure-as-code tools, specifically AWS CDK and Terraform.
  • Strong experience designing and implementing CI/CD pipelines using GitLab CI/CD.
  • Experience deploying and managing .NET applications in cloud environments.
  • Deep understanding of security best practices and their implementation in cloud infrastructure.
  • Solid understanding of networking principles in cloud environments.
  • Experience with monitoring and logging tools (e.g., New Relic, CloudWatch).
  • Strong scripting skills (e.g., PowerShell, Python, Ruby, Bash).
  • Excellent problem-solving and troubleshooting skills.
  • Strong communication and collaboration skills.
  • Experience with containerization technologies (e.g., Docker, Kubernetes) is a plus.
  • Relevant AWS and/or GCP certifications are a plus.
  • Experience with the configuration management tool Chef.
  • Strong understanding of PowerShell and Python scripting.
  • Strong background with AWS EC2 features and services (Auto Scaling and warm pools).
  • Understanding of the Windows Server build process using tools like Chocolatey and Packer.
  • Comprehensive Medical, Dental, and Vision.
  • Paid Time Off Programs including vacation, holidays, illness, and parental leave.
  • Wellness Programs, Employee Recognition Programs, on-site gyms, and cafe-style dining.
  • Retirement benefits, life insurance, 401k match, and tuition reimbursement.
  • Philanthropy Programs including matching gifts, volunteer grants, charitable grants, and corporate sponsorships.