Senior AI Engineer

PepsiCo, Plano, TX

About The Position

We are seeking a Senior AI Engineer to build and operate an internal platform that enables teams to deliver agent orchestration solutions with high velocity and low operational risk. The role treats the platform as an internal product, providing self-service golden paths (standardized, end-to-end workflows), reusable templates, and API-driven automation that reduce developer cognitive load and improve delivery consistency. The role owns the developer experience for provisioning environments, deploying services/workflows, exposing runtimes through gateways in a policy-compliant way, and achieving operational readiness through observability and SLO-based reliability practices.

Requirements

  • Bachelor’s in ML/CS/Engineering or equivalent experience required.
  • 8 years of experience with AI/DS/ML required.
  • Proven experience building internal platforms / developer tooling and operating production services.
  • Platform engineering mindset: internal platform as product; self-service workflows and templates that reduce cognitive load.
  • CI/CD + release engineering: reusable pipelines, policy checks, promotion gates, safe rollout/rollback patterns.
  • Infrastructure automation: IaC, environment templating, standardization, drift control (tool choice per enterprise standards).
  • Observability & reliability: SLIs/SLOs, runbooks, monitoring/alerting fundamentals.
  • Security-by-default: SSO/OIDC concepts, RBAC/ABAC awareness, secrets management hygiene, audit logging expectations.
  • Strong stakeholder collaboration and developer enablement (docs, examples, onboarding).

Nice To Haves

  • Master’s degree preferred.

Responsibilities

  • Build and maintain API-driven self-service capabilities that allow delivery teams to deploy and operate services/workflows with reduced lead time and stack complexity.
  • Design and deliver golden paths (paved roads) for common workflows (service creation, environment provisioning, deployment, rollback, observability enablement), reducing cognitive load and improving compliance by default.
  • Provide standardized environment templates (non-prod/prod) with tenant isolation patterns and secure defaults; minimize drift via code-reviewed configuration and drift detection.
  • Implement secure-by-default platform guardrails (approved base images, default policies, standard logging/metrics/traces) to reduce operational risk and audit gaps.
  • Provide reusable CI/CD templates and promotion gates (build, test, security scans, deploy) that delivery teams can adopt with minimal customization, increasing delivery consistency.
  • Establish standardized rollout/rollback practices and deployment verification steps as part of the golden paths (release safety as a default).
  • Define and operationalize SLIs/SLOs for platform services; ensure telemetry export (metrics/logs/traces) and actionable runbooks for incidents and rollbacks.
  • Partner with SRE/operations to support availability, latency, performance, change management, monitoring/alerting, and emergency response for platform services.
  • Run continuous feedback loops with platform users (product/agent teams) to refine platform experiences, improve golden paths, and publish adoption/usage insights.
  • Deliver enablement: documentation, examples, onboarding playbooks, and office hours to increase adoption and reduce ticket-based ops.

Benefits

  • Paid parental leave
  • Vacation
  • Sick leave
  • Bereavement leave
  • Medical
  • Dental
  • Vision
  • Disability
  • Health and Dependent Care Reimbursement Accounts
  • Employee Assistance Program (EAP)
  • Insurance (Accident, Group Legal, Life)
  • Defined Contribution Retirement Plan