Principal System Engineering

AT&T, Plano, TX
$174,100 - $261,100 (Onsite)

About The Position

The Principal System Engineering role is responsible for partnering with various Consumer Technology Experience (CTx), Development, Operations/Infrastructure, and Solution Architect teams to design a comprehensive Performance Engineering framework aligned with business needs. This role involves planning, designing, and rolling out Performance Engineering Test solutions to prevent disruption and re-engineering costs by detecting defects early and building performance into projects from the outset. The position directly oversees cloud platform planning, creation, and maintenance of applications on public cloud instances. The ideal candidate will possess expert-level knowledge of performance and cloud engineering methodologies, strategies, and technologies. This role also includes mentoring junior engineers and guiding peer team members in test engineering and quality strategy.

The Principal System Engineer will design robust Performance Test Strategies to validate the impact of changes on individual applications and the broader consumer technology ecosystem. They will participate in solution design to ensure performance is a core consideration and collaborate with vendor/partner teams for load test execution according to the software delivery schedule. Reviewing performance test results to ensure infrastructure and application responsiveness are evaluated for production readiness is also a key responsibility. This role provides SME leadership within the Consumer Quality Engineering (CQE) team on transformative efforts and implements best practices in Performance Testing. Additionally, the role involves designing and implementing automated Performance Testing within the CI/CD pipeline, defining and implementing post-mortem/root-cause analysis processes, and developing improved testing scenarios based on analysis.

Performing workload analysis, future growth analysis, and application endurance certification using automated client scripts and performance assurance tools is required. The engineer will implement automated utilities for capturing application transaction traces, sessions, browser network logs, and response times, validating this data against ELK and Dynatrace. Understanding applications and technologies, onboarding new applications by testing and validating them in lower environments, identifying gaps/anomalies, and providing root causes are essential. Collaboration with vendor support for issue resolution or enhancements, and working with application teams for code changes or metric capture are also part of the role. Developing documentation, ensuring systems meet requirements, and participating in issue identification, analysis, and resolution are expected.

The role includes building service simulations using Broadcom Dev Test Service Virtualization and Mock Server, creating hypotheses based on production outages, and designing/executing chaos scenarios using Gremlin to test application resiliency. Recommendations based on chaos experiment analysis will be provided. Collaboration with developers, operations, and security teams to understand system architecture and identify weaknesses is crucial. A detailed volumetric analysis and derivation of a Workload Model are also required.

Requirements

  • Bachelor’s degree, or foreign equivalent degree, in Electronic Engineering.
  • 5 years of progressive, post-baccalaureate experience in the job offered, or 5 years of progressive, post-baccalaureate experience in a related occupation creating and maintaining applications in cloud engineering.
  • Proficiency in automated performance validation during software delivery through the CI/CD pipeline.
  • Experience with microservices architecture and containerization technologies like Docker and Kubernetes.
  • Experience in analysis, design, estimation, project planning, and development of CQE solutions for CTx applications in Java, along with stakeholder interaction.
  • Experience gathering clients’ business requirements and assessing their feasibility through gap analysis.
  • Experience with performance testing using automated testing tools that generate and execute test cases and test data.
  • Experience preparing technical/business reports and weekly status reports, and supporting bug fixes and issues reported in the performance testing phase of a release.

Nice To Haves

  • Expert-level working knowledge of the key methodologies, strategies, and technologies of performance and cloud engineering.

Responsibilities

  • Partnering with CTx, Development, Operations/Infrastructure, and Solution Architect teams in designing a comprehensive Performance Engineering framework.
  • Planning, designing, and rolling out Performance Engineering Test solutions.
  • Taking direct responsibility for cloud platform planning, creation, and maintenance of applications residing on our public cloud instances.
  • Mentoring junior-level engineers and serving as a guide for peer team members.
  • Designing robust Performance Test Strategies to validate the impact of changes on the performance of individual applications as well as the consumer technology ecosystem.
  • Participating in solution design and providing recommendations to ensure performance is built into the design of a solution.
  • Working closely with Vendor/Partner teams for execution of load tests based on software delivery/release schedule.
  • Reviewing performance test results and ensuring all aspects of infrastructure and application responsiveness are evaluated for delivery to production.
  • Providing key SME leadership within the Consumer Quality Engineering (CQE) team on transformative efforts and implementing industry best practices in Performance Testing.
  • Designing and implementing automated Performance Testing in the CI/CD pipeline.
  • Defining and implementing post-mortem / root-cause analysis processes, and developing improved testing scenarios based on that analysis.
  • Performing workload analysis, future growth analysis, and application endurance certification through automated client scripts, using performance assurance tools to automate application user flows.
  • Implementing an automated utility to capture application transaction traces, sessions, browser network logs, and response times for each action performed on the application.
  • Understanding the application and its technologies, and onboarding new applications by testing and validating them in lower environments.
  • Identifying gaps/anomalies and providing root causes for the issues identified.
  • Working with vendor support and providing the details required to resolve issues or deliver enhancements needed for the application.
  • Working with the application team to capture additional metrics and metadata from the application pages.
  • Working with the application team on code changes where manual injections are needed.
  • Developing documentation (as required) on new or existing systems.
  • Ensuring systems meet documented user requirements.
  • Participating in identification, analysis, and resolution of identified issues.
  • Building service simulations of project-blocking and overhead services on the private on-premises cloud using Broadcom Dev Test Service Virtualization and Mock Server.
  • Creating hypotheses based on production outages (a reactive approach).
  • Designing chaos scenarios for critical applications (a proactive approach).
  • Executing various chaos scenarios using the Gremlin tool to target System Resources, System State, and Network.
  • Providing recommendations based on chaos experiment analysis.
  • Collaborating with other teams, including developers, operations, and security, to understand the system's architecture and identify potential weaknesses.
  • Conducting a detailed volumetric analysis and deriving a Workload Model.

Benefits

  • Medical/Dental/Vision coverage
  • 401(k) plan
  • Tuition reimbursement program
  • Paid Time Off and Holidays (based on date of hire, at least 23 days of vacation each year and 9 company-designated holidays)
  • Paid Parental Leave
  • Paid Caregiver Leave
  • Additional sick leave beyond what state and local law require may be available but is unprotected
  • Adoption Reimbursement
  • Disability Benefits (short term and long term)
  • Life and Accidental Death Insurance
  • Supplemental benefit programs: critical illness/accident hospital indemnity/group legal
  • Employee Assistance Programs (EAP)
  • Extensive employee wellness programs
  • Employee discounts up to 50% off on eligible AT&T mobility plans and accessories, AT&T internet (and fiber where available) and AT&T phone