Associate - Tech Ops Engineering

American Express
Sunrise, FL
33d
$78,000 - $124,750

About The Position

At American Express, our culture is built on a 175-year history of innovation, shared values and Leadership Behaviors, and an unwavering commitment to back our customers, communities, and colleagues. As part of Team Amex, you'll experience this powerful backing with comprehensive support for your holistic well-being and many opportunities to learn new skills, develop as a leader, and grow your career. Here, your voice and ideas matter, your work makes an impact, and together, you will help us define the future of American Express.

How will you make an impact in this role?

We're looking for a Site Reliability / Application Support Engineer responsible for web application performance, availability, and reliability. The candidate will provide consultation and strategic recommendations by quickly assessing and remediating complex platform availability issues. Site Reliability Engineering / Application Support (SRE/AS) is a continuous engineering discipline that combines software development and systems engineering to build and run scalable, distributed, fault-tolerant systems. This role will ensure that American Express internal and external services have reliability and uptime appropriate to users' needs. We also ensure continuous improvement while keeping an ever-watchful, automated eye on capacity and performance.

This role will drive the SRE/AS mindset, which strives to use software engineering to build and run better production systems. You will write software to optimize day-to-day work through better automation, monitoring, alerting, testing, and deployment. You'll be expected to work with several Technology partners to identify areas of opportunity within the availability platform and to build solutions that automate monitoring for the modernization platform, using technology and constant innovation to drive efficiencies. You will be responsible for implementing tracing, monitoring, and tooling solutions to maximize the performance and availability of our web applications.

This is an opportunity to work in one of the best Technology units to help improve customer experience for American Express digital assets and influence how millions of people interact with their cards, their merchants, and their money. The Associate Tech Ops Engineer (SRE/AS Engineer) role is a hands-on position supporting the American Express Site Reliability Engineering / Application Support team.

Requirements

  • BS or MS degree in computer science, computer engineering, or other technical discipline, or equivalent 8+ years of work experience in Site Reliability Engineering /Application Support (SRE/AS) supporting Full-stack applications
  • Development or support of Java/J2EE, React JS, and Node applications
  • Hands-on experience with frameworks such as Spring Boot, Vert.x, and NodeJS
  • Experience in designing mission critical highly available enterprise applications
  • Hands-on experience with performance testing framework design and tuning Java applications
  • Experience managing relational and NoSQL databases such as DB2, Postgres, Mongo, Couchbase, Cassandra etc.
  • Strong knowledge of Linux internals and experience managing Linux systems in high traffic environments
  • Strong interpersonal communication skills and the ability to work well in a diverse team-focused environment
  • Experience with Splunk and/or ELK, including hands-on experience configuring Splunk, Grafana dashboards, ElastAlert, OpenSearch, etc.
  • Good understanding of cloud technologies - Kubernetes, OpenShift, Docker etc.
  • Good understanding of GraphQL queries and resolvers
  • Monitoring and analyzing PMI data
  • Hands-on experience with enterprise tool sets such as Grafana, Dynatrace, AppDynamics, BMC, Prometheus, etc.
  • Understanding of Agile practices in operations teams
  • Experience handling DDoS and bot attacks and applying security remediations
  • Working experience with Network load balancers, Global Traffic Managers (GTMs), Local Traffic Managers (LTMs)
  • Working experience with network rule creation, load balancer configuration, and network packet analysis
  • Analytical skills and exposure to root cause identification using analysis tools such as IBM Support Assistant, Splunk, etc.
  • Open to working in a 24/7 or on-call environment

Nice To Haves

  • Knowledge of public cloud technologies such as GCP, AWS, and Azure would be an advantage

Responsibilities

  • Provide hands-on support for the runtime operation of our applications, ensuring high availability and performance.
  • Collaborate with software engineering and infrastructure teams to troubleshoot and resolve runtime issues, including performance bottlenecks, scalability challenges, and system failures.
  • Contribute to the design and implementation of monitoring, alerting, and logging solutions to proactively identify and address potential runtime issues.
  • Participate in incident response and root cause analysis efforts to ensure the stability and resilience of the applications.
  • Work closely with cross-functional teams to understand application requirements and provide input on runtime and operational considerations during the software development lifecycle.
  • Contribute to the development and maintenance of runtime automation and tooling to streamline operational processes and improve efficiency.
  • Mentor your peers and foster a continuous learning environment for the team.
  • Develop common framework components (to be leveraged by enterprise applications) and define standards for configuration, monitoring, reliability, and performance engineering.
  • Drive automation and ensure automated test scripts are completed for new features.
  • Bring a good attitude, strong communication, and a willingness to learn and collaborate.
  • Bring a culture of innovation, ideas, and continuous improvement.
  • Challenge the status quo, demonstrate risk taking, and implement creative ideas.
  • Continuously improve automated remediation tasks to ensure the highest levels of availability.

Benefits

  • For a full list of Team Amex benefits, visit our Colleague Benefits Site.

What This Job Offers

Job Type: Full-time

Career Level: Entry Level

Industry: Credit Intermediation and Related Activities

Number of Employees: 5,001-10,000 employees

© 2024 Teal Labs, Inc