Lead Infrastructure Engineer

Truist, Atlanta, GA
Posted 1d ago, Onsite

About The Position

The Lead Infrastructure Engineer in this role will be expected to resolve complex technical issues for Truist's Network applications and the following:

  1. System Monitoring & Analysis - Continuously monitor network, server, and storage utilization to identify potential bottlenecks, performance issues, and resource shortages.
  2. Capacity Planning - Forecast future resource needs based on business goals and growth, ensuring the infrastructure can handle increasing demands.
  3. Performance Optimization - Tune servers, networks, and applications to maximize efficiency and ensure optimal resource allocation.
  4. Scalability & Design - Design and implement scalable IT solutions, including public cloud infrastructure, to accommodate future growth.
  5. Troubleshooting & Support - Diagnose and resolve complex infrastructure issues, provide technical expertise, and serve as an escalation point for capacity problems.
  6. System Administration - Manage and configure physical and virtual servers, data storage, and network devices to meet capacity standards.
  7. Public Cloud Expertise - Design, implement, and manage cloud-based infrastructure solutions and services for capacity and utilization.
  8. Documentation - Create and maintain detailed documentation for infrastructure systems, processes, and changes.
  9. Reporting - Provide assessments and reports on system performance, infrastructure health, and capacity plans to management.

TECHNICAL SKILLS:

  1. Distributed Systems - Ability to operate, evaluate, and manage large-scale, distributed systems while understanding interdependencies of competing constraints between various technologies.
  2. Virtualization - Strong experience with virtualization, including hyperconverged technologies, VMware, and Nutanix, with emphasis on performance utilization.
  3. Containerization - Experience with tools such as Kubernetes, Docker, and OpenShift.
  4. Public Cloud Platforms - In-depth knowledge of major cloud providers and service offerings from AWS, Azure, and Google Cloud.
  5. Networking - Proficiency in performance and utilization aspects of network operations, including routing, switching, load balancing, firewalls, and edge networks.
  6. Databases - Solid experience with large-scale database systems, including MSSQL, Oracle / Exadata, Postgres, and Mongo.
  7. Capacity Modeling - Ability to design and implement scalable capacity models and forecasting tools for compute, storage, and network infrastructure.
  8. Data Analysis - Strong ability to analyze large datasets to generate insights and support decision-making.
  9. Forecasting - Proven experience in forecasting and supply-demand matching for large-scale environments.
  10. Performance Tuning - Knowledge of system and application performance optimization techniques.
  11. Monitoring & Observability - Deep understanding of infrastructure monitoring and performance optimization tools.
  12. Scripting Languages - Proficiency in Python, Go, Bash, or PowerShell for automating tasks and developing tools.

Please note that the candidate must be located in or willing to self-relocate to one of the following locations: Charlotte, Raleigh or Wilson, NC; Atlanta, GA or Richmond, VA.

For this opportunity, Truist will not sponsor an applicant for work visa status or employment authorization, nor will we offer any immigration-related support for this position (including, but not limited to, H-1B, F-1 OPT, F-1 STEM OPT, F-1 CPT, J-1, TN-1 or TN-2, E-3, O-1, or future sponsorship for U.S. lawful permanent residence status).

  • Charlotte, NC
  • Raleigh, NC
  • Winston-Salem, NC
  • Atlanta, GA

Truist has 'in office' requirements that must be honored.

Requirements

  • Bachelor's degree and five years of experience in development or application support or an equivalent combination of education and work experience.
  • In-depth knowledge of information systems and ability to identify, apply, and implement best practices.
  • Understanding of key business processes and competitive strategies related to the IT function.
  • Ability to plan and manage projects.
  • Ability to solve complex problems by applying best practices.
  • Ability to provide direction and mentor less experienced teammates.
  • Ability to interpret and convey complex, difficult, or sensitive information.
  • Ability to operate, evaluate, and manage large-scale, distributed systems while understanding interdependencies of competing constraints between various technologies.
  • Strong experience with virtualization, including hyperconverged technologies, VMware, and Nutanix, with emphasis on performance utilization.
  • Experience with tools such as Kubernetes, Docker, OpenShift.
  • In-depth knowledge of major cloud providers and service offerings from AWS, Azure, and Google Cloud.
  • Proficiency in performance and utilization aspects of network operations including routing, switching, load balancing, firewalls and edge networks.
  • Solid experience with large-scale database systems including MSSQL, Oracle / Exadata, Postgres, Mongo.
  • Ability to design and implement scalable capacity models and forecasting tools for compute, storage, and network infrastructure.
  • Strong ability to analyze large datasets to generate insights and support decision-making.
  • Proven experience in forecasting and supply-demand matching for large-scale environments.
  • Knowledge of system and application performance optimization techniques.
  • Deep understanding of infrastructure monitoring and performance optimization tools.
  • Proficiency in Python, Go, Bash, or PowerShell for automating tasks and developing tools.

Nice To Haves

  • Bachelor's degree and six years of experience or an equivalent combination of education and work experience.
  • Banking or financial services experience.
  • Communication - Excellent written and verbal communication skills for planning and troubleshooting with diverse teams and stakeholders.
  • Collaboration - Experience working effectively with engineering, operations, business teams, and vendors.
  • Problem-Solving - Strong analytical and problem-solving skills to make data-driven decisions under uncertainty.

Responsibilities

  • Performs problem tracking, diagnosis and root-cause analysis, replication, troubleshooting, and resolution for complex issues. In this capacity, performs programming and debugging activities.
  • Responds to issues in a timely manner by receiving and investigating incidents or service tickets.
  • Analyzes and observes trends in technical issues and develops recommendations for long-term improvements.
  • Documents all relevant end-user interactions and steps taken to resolve incidents.
  • Has occasional contact with end-users.
  • Communicates status of issue resolution to internal customers.
  • May engage and manage outside vendors.
  • Applies in-depth knowledge of application support and an understanding of best practices.
  • Typically leads moderately complex projects and participates in larger, more complex initiatives.
  • Solves complex technical and operational problems.
  • Acts as a resource for teammates with less experience.
  • May have people management responsibilities for a small team.

Benefits

  • Truist offers medical, dental, vision, life insurance, disability, accidental death and dismemberment, tax-preferred savings accounts, and a 401k plan to teammates.
  • Teammates also receive no less than 10 days of vacation (prorated based on date of hire and by full-time or part-time status) during their first year of employment, along with 10 sick days (also prorated), and paid holidays.
  • Depending on the position and division, this job may also be eligible for Truist’s defined benefit pension plan, restricted stock units, and/or a deferred compensation plan.