Platform Engineer II

Upbound Group
Plano, TX
Onsite

About The Position

The Platform Engineer II is a lead individual contributor responsible for driving the development, maintenance, and optimization of platform infrastructure and services. This individual will take ownership of significant platform initiatives, lead design efforts for new features, and actively mentor junior engineers. They will play a critical role in improving our platform's reliability, efficiency, and security, and in defining and implementing best practices that accelerate the Software Development Life Cycle (SDLC). This role also serves as a key builder of AI-powered platform capabilities, including the design, integration, and ongoing maintenance of AI components embedded in the CI/CD pipeline.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
  • 6+ years of progressive experience in Platform Engineering, DevOps, or SRE, with demonstrated leadership in technical initiatives.
  • 2+ years of experience working with AI/LLM-powered components.
  • Deep expertise in cloud computing platforms (AWS and/or GCP), including advanced networking, compute, and security services.
  • Mastery of infrastructure as code (Terraform, CloudFormation), including developing and managing complex IaC repositories.
  • Extensive experience with advanced CI/CD pipeline automation (GitHub Actions, GitLab CI, Argo CD, Jenkins).
  • Expert-level knowledge of containerization (Docker) and orchestration (Kubernetes), including cluster management and optimization.
  • Highly proficient in scripting/programming languages (e.g., Python, Go, Bash) for automation, tool development, and API integration.
  • Proven experience in designing and implementing comprehensive observability stacks (Grafana, Prometheus, New Relic, OpenTelemetry, ELK, Splunk).
  • Strong understanding of enterprise networking, security best practices, and compliance frameworks.
  • Excellent leadership, mentoring, analytical, and communication skills, with the ability to drive technical consensus.
  • Production experience building and operating LLM-powered pipelines or AI agents, with hands-on knowledge of MCP server integration or equivalent agentic AI tooling.
  • Experience instrumenting AI pipelines for observability, quality measurement, and continuous improvement through data-driven feedback loops.

Nice To Haves

  • Experience integrating LLMs with issue tracking systems such as Jira or Linear, particularly for ticket-to-code automation workflows.
  • Background in DORA metrics, engineering productivity measurement, or developer experience (DevEx) platform design.
  • Experience with Developer Portal tooling such as Backstage or equivalent internal developer platform frameworks.
  • Familiarity with agentic AI systems that take multi-step actions autonomously and recover gracefully from failures.
  • Familiarity with responsible AI practices: output validation, guardrails, and human-in-the-loop review workflows.

Responsibilities

  • Lead the design, implementation, and management of complex, scalable, and secure platform components across cloud environments (e.g., AWS, GCP).
  • Drive the adoption and continuous improvement of infrastructure as code (IaC) practices, developing reusable modules and patterns.
  • Architect and implement advanced CI/CD pipelines, including GitOps principles and progressive delivery strategies.
  • Design, implement, and optimize comprehensive observability solutions, ensuring robust monitoring, logging, tracing, and alerting.
  • Provide expert-level troubleshooting and perform root cause analysis for critical platform issues.
  • Lead architectural discussions and propose innovative solutions to complex platform challenges.
  • Ensure high availability, disaster recovery, and performance of critical platform services through proactive engineering.
  • Collaborate extensively with DevEx, development, and security teams, acting as a technical liaison to integrate applications with platform services and provide strategic guidance.
  • Mentor junior and mid-level engineers, fostering their technical growth and promoting best practices within the team.
  • Evaluate and recommend new technologies, tools, and processes to enhance platform capabilities.
  • Design, build, and operate AI-powered components integrated into the CI/CD pipeline, including LLM and MCP-based automation capabilities.
  • Build and maintain the Developer Portal, including service scaffolding, golden path templates, and adoption metrics to improve developer experience.

Benefits

  • Competitive compensation
  • Full health benefits: medical, dental, and vision
  • 401(k) match (5%/4%)
  • DTO (discretionary time off)
  • Health savings account (HSA) with company contribution
  • College tuition reimbursement program (STEM degrees)
  • Unlimited use of LinkedIn Learning
  • On-site gym and showers
  • Free car charging and covered parking