Senior SRE Engineer

Infosys • Richardson, TX

About The Position

Infosys is seeking a Senior SRE Engineer. In this role, as a Tier 2/Site Reliability Engineer (SRE), you will translate core business requirements into robust, scalable, and reliable technical solutions. You will play a pivotal role in designing and implementing applications, platforms, and services that power critical business operations, with a strong emphasis on high availability, performance, and compliance in cloud, messaging, and data environments. The role calls for a hybrid of traditional T2/SRE operations skills to support Project Growth apps and the new, evolving Generative AI and Workflow Automation skill sets needed to drive operational efficiency and scalability. You will need strong knowledge of network and telecom standards (3GPP, TM Forum, etc.) and a practical understanding of AI/ML concepts and their integration in enterprise platforms.

Requirements

Bachelor’s degree or foreign equivalent required from an accredited institution. Will also consider three years of progressive experience in the specialty in lieu of every year of education.
At least 4 years of Information Technology experience.
Candidate must be located within commuting distance of Richardson, TX or willing to relocate to the area.
This position may require travel to project locations.
Candidates authorized to work for any employer in the United States without employer-based visa sponsorship are welcome to apply.
Infosys is unable to provide immigration sponsorship for this role at this time.
Experience in SRE / Tier 2 with hands-on experience in Openet / Mediation / Charging.

Nice To Haves

As a Tier 2 / Senior SRE engineer, you should be able to drive the tasks listed below.

Responsibilities

Ensure applications and systems are highly reliable, scalable, and performant while fostering a collaborative culture between development and operations.
Work with the T1 team as triage lead on incidents during outages or critical issues.
Handle pager duty issues.
Minimize downtime and user impact during incidents.
Conduct detailed After Action Reviews involving all stakeholders and outline short-term and long-term resiliency options.
Eliminate recurrence of similar issues through systemic fixes.
Define and implement monitoring and alerting strategies tailored to the launch.
Collaborate with Product development teams to gain deep insight into the application architecture, flows and critical dependencies.
Monitor and evaluate key performance metrics such as latency, throughput, and error rates, and update alerts accordingly.
Propose architectural or operational changes to prevent recurrence.
Reduce Mean Time to Resolution (MTTR) for incidents.

Benefits

Medical/Dental/Vision/Life Insurance
Long-term/Short-term Disability
Health and Dependent Care Reimbursement Accounts
Insurance (Accident, Critical Illness, Hospital Indemnity, Legal)
401(k) plan and contributions dependent on salary level
Paid holidays plus Paid Time Off
