Cloud Platform Engineer

Farmers Insurance Careers
Hybrid

About The Position

The Cloud Platform Operations team at Farmers is responsible for the analysis, design, implementation, and operational support of multi-cloud infrastructure across GCP (Google Cloud Platform), Azure (Microsoft cloud), and AWS. The team ensures the availability, scalability, and compliance of these cloud platforms while driving AIOps, FinOps, and autonomous infrastructure management. The role focuses on leveraging multi-cloud AI offerings to drive operational reliability, predictive monitoring, and continuous optimization, maintaining high availability and resilience in production environments. It also ensures that all solutions adhere to established architectural design patterns, governance standards, and security controls for high-complexity configurations. The Sr. Cloud Platform Engineer specializes in methodologies, various programming languages, Infrastructure as Code platforms, and the design and operation of AI-optimized cloud environments.

Requirements

  • 5 years of experience in public cloud operations (AWS, Azure, GCP), with a strong focus on AIOps integration.
  • Deep, demonstrable expertise designing and operationalizing solutions leveraging AWS Bedrock/Agent Frameworks and Azure Copilot for Cloud Operations.
  • Expertise in Infrastructure as Code (Terraform, CloudFormation), Ansible, and CI/CD pipelines, including supervising AI-generated infrastructure artifacts.
  • Proficiency in scripting languages (Python, Bash).
  • Expertise in integrating observability platforms (Dynatrace, Prometheus) into AI/ML platforms for predictive analysis and anomaly detection.
  • Understanding of Site Reliability Engineering (SRE) and operational reliability principles.
  • Experience with monitoring tools (CloudWatch, Prometheus, Dynatrace, Azure Monitor) and ServiceNow.
  • Ability to adapt to new cloud releases and emerging technologies.
  • Excellent troubleshooting, problem-solving, and communication skills.
  • Bachelor’s degree in computer science or other technical discipline.

Nice To Haves

  • Hands-on experience with JBoss, Tomcat, IBM WAS, Oracle, DB2, MSSQL, and Postgres is a plus.
  • Technical certifications are a plus.

Responsibilities

  • Ensure availability, scalability, reliability, and security of cloud platforms and services.
  • Design, deploy, and govern AI-powered agents (e.g., using Azure Copilot/AWS Bedrock) to achieve autonomous self-healing capabilities and automated resource management.
  • Handle regular operational requests hands-on using Terraform, including EC2 changes, S3 updates, user access management, and changes to managed services such as SageMaker, Bedrock, Storage Gateway, RDS, and Transfer Family.
  • Supervise and refine AI-generated Infrastructure-as-Code (IaC) (Terraform/Ansible), and develop and maintain complex, scalable Terraform/Ansible/CloudBees (Jenkins) automation pipelines to provision, deploy, patch, and manage cloud infrastructure.
  • Implement AI-based automation solutions for Cloud Operations to monitor performance and scalability and to respond autonomously to incidents and operational issues.
  • Implement GenAI tools to perform real-time Root Cause Analysis (RCA), correlate complex event data (logs, metrics), and auto-generate runbooks and incident summaries.
  • Develop and train predictive ML models that analyze historical telemetry to forecast potential system outages or performance bottlenecks, and configure proactive monitoring and alerting for critical services.
  • Manage security and compliance by utilizing AI agents to detect configuration drift and auto-generate compliant updates for IAM, network, and security policies.
  • Manage security and compliance by remediating vulnerabilities, configuring notifications, supporting audits, and maintaining certifications and governance standards.
  • Collaborate with application, architecture, AIOps, FinOps, and security teams to ensure production readiness.
  • Lead proofs of concept, implement new cloud services, and adapt quickly to new cloud releases and features to enhance operational capabilities.
  • Understand middleware components to provide end-to-end production support and troubleshoot complex operational scenarios.
  • Work with application teams, analyze logs and data, open service requests, work with vendors and partners, and drive problem resolution.
  • Review architectural designs for applications to ensure reliable and performant design patterns are implemented.
  • Deploy applications, workloads, and data to the cloud environment, often involving migration from on-premises infrastructure or other cloud providers.
  • Work with the Finance and Procurement teams to implement cost optimization strategies based on changing workload patterns, business requirements, and new offerings from cloud providers.

Benefits

  • Bonus Opportunity (based on Company and Individual Performance)
  • 401(k)
  • Medical
  • Dental
  • Vision
  • Health Savings and Flexible Spending Accounts
  • Life Insurance
  • Paid Time Off
  • Paid Parental Leave
  • Tuition Assistance