The Principal System Engineer partners with Consumer Technology Experience (CTx), Development, Operations/Infrastructure, and Solution Architect teams to design a comprehensive Performance Engineering framework aligned with business needs. The role involves planning, designing, and rolling out Performance Engineering Test solutions that prevent disruption and re-engineering costs by detecting defects early and building performance into projects from the outset. The position directly oversees cloud platform planning and the creation and maintenance of applications on public cloud instances. The ideal candidate will possess expert-level knowledge of performance and cloud engineering methodologies, strategies, and technologies. The role also includes mentoring junior engineers and guiding peer team members in test engineering and quality strategy.
The Principal System Engineer will design robust Performance Test Strategies to validate the impact of changes on individual applications and the broader consumer technology ecosystem. They will participate in solution design to ensure performance is a core consideration, and collaborate with vendor/partner teams to execute load tests according to the software delivery schedule. They will also review performance test results to confirm that infrastructure and application responsiveness have been evaluated for production readiness. The role provides SME leadership within the Consumer Quality Engineering (CQE) team on transformative efforts and implements best practices in Performance Testing. It also involves designing and implementing automated Performance Testing within the CI/CD pipeline, defining and implementing post-mortem/root-cause analysis processes, and developing improved testing scenarios based on that analysis.
The engineer will perform workload analysis, future growth analysis, and application endurance certification using automated client scripts and performance assurance tools. The engineer will implement automated utilities for capturing application transaction traces, sessions, browser network logs, and response times, and validate this data against ELK and Dynatrace. Essential duties include understanding applications and technologies, onboarding new applications by testing and validating them in lower environments, identifying gaps and anomalies, and providing root causes. The role also involves collaborating with vendor support on issue resolution or enhancements, and working with application teams on code changes or metric capture. The engineer is expected to develop documentation, ensure systems meet requirements, and participate in issue identification, analysis, and resolution. The role includes building service simulations using Broadcom DevTest Service Virtualization and Mock Server, creating hypotheses based on production outages, designing and executing chaos scenarios using Gremlin to test application resiliency, and providing recommendations based on analysis of those chaos experiments. Collaboration with developers, operations, and security teams to understand system architecture and identify weaknesses is crucial. A detailed volumetric analysis and derivation of a Workload Model are also required.
Job Type: Full-time
Career Level: Principal