AI-Ops Engineer

Avispa Technology | Stanford, CA
2d | $60 - $60 | Hybrid

About The Position

A leading university seeks an AI-Ops Engineer. The successful candidate will be responsible for evolving traditional DevOps into AI-Ops at the Engineering Center. This role leverages AI and machine learning to automate and enhance IT operations. The company offers a family-oriented culture and environment!

Requirements

  • 3+ years of experience in DevOps, SRE, or Cloud Engineering roles.
  • 2+ years of hands-on experience with AWS infrastructure (EC2, ECS, Lambda, S3, IAM, VPC).
  • Experience implementing monitoring, observability, and alerting solutions at scale.
  • Bachelor's degree in Computer Science, DevOps, Cloud Engineering, or a related field (Master's preferred).
  • AWS certification preferred (Solutions Architect, SysOps Administrator, or DevOps Engineer); Professional-level certification a plus.
  • Familiarity with ML/AI concepts and their application to operational automation.
  • Languages: Python (required); Bash, Go, or TypeScript preferred.
  • AIOps & Monitoring: CloudWatch, X-Ray, Prometheus, Grafana, Datadog, or Splunk with ML capabilities.
  • Infrastructure as Code: AWS CloudFormation, Terraform, or AWS CDK.
  • Containers & Orchestration: Docker, AWS ECS/Fargate, Kubernetes (EKS).
  • AWS Services: Lambda, EC2, S3, API Gateway, EventBridge, CloudWatch, IAM, VPC, CodePipeline, SageMaker.
  • CI/CD Tools: GitHub Actions, AWS CodePipeline, Jenkins, or GitLab CI.
  • Data & Analytics: Experience with log aggregation, metrics analysis, and event correlation platforms.
  • Strong understanding of AIOps principles (using AI to enhance, not just support, IT operations) is preferred.

Nice To Haves

  • Passion for automation and eliminating manual, repetitive operational tasks is preferred.
  • Excellent problem-solving, debugging, and root cause analysis skills are preferred.
  • Demonstrated ability to learn rapidly, adapt to new technologies, and continuously improve is preferred.
  • Strong communication skills, with the ability to collaborate across technical and non-technical teams, are preferred.
  • Commitment to reliability, security, and operational excellence is preferred.
  • Ability to thrive in a fast-paced, evolving environment, proactively seeking opportunities to embed intelligence into systems and processes, is preferred.

Responsibilities

  • AI-Driven Operations & Automation: Implement AIOps solutions that use ML algorithms to automate performance monitoring, workload scheduling, and infrastructure management.
  • Build anomaly detection systems that identify infrastructure issues before they impact users.
  • Create predictive maintenance workflows that analyze historical patterns to proactively mitigate issues.
  • Observability & Intelligent Monitoring: Architect comprehensive observability platforms that aggregate data from disparate sources into unified dashboards.
  • Implement intelligent alerting systems using NLP and ML to reduce alert fatigue and surface actionable insights.
  • Deploy application performance monitoring (APM) solutions integrated with AI-driven analytics.
  • Ensure end-to-end visibility across cloud infrastructure, applications, and AI/ML workloads.
  • Cloud Infrastructure & DevOps: Design, build, and maintain scalable, secure AWS infrastructure using Infrastructure as Code (CloudFormation, Terraform, or CDK).
  • Implement and manage containerized environments using Docker, AWS ECS, Fargate, and Kubernetes (EKS).
  • Build CI/CD pipelines for continuous delivery, integrating AI-powered code quality and deployment optimization.
  • Collaboration & Continuous Improvement: Partner with cross-functional teams to implement domain-agnostic AIOps solutions across the organization.
  • Use Git-based version control and code review best practices as part of a collaborative, agile workflow.
  • Document operational procedures, runbooks, and AIOps workflows for team knowledge sharing.
  • Occasional on-call responsibilities for critical infrastructure.

Benefits

  • Group Medical
  • Dental
  • Vision
  • Life
  • Retirement Savings Program
  • PSL