Lead Systems Administration

AT&T
Plano, TX
Onsite

About The Position

The Lead Systems Administration role involves performing the deployment of software updates on a cloud platform, ensuring continuous service availability, troubleshooting production issues, and providing incident response. The position requires proactive development and implementation of monitoring systems, maintenance of tools, scripting, and automation for configuration management, maintenance, testing, auditing, problem remediation, and capacity planning. Collaboration with vendors to resolve defects and drive enhancements is also a key responsibility. The role includes conducting hardware and software audits, managing backups, developing standard operating procedures, performing security compliance and auditing, and conducting Operational Readiness and Acceptance Testing. Additionally, the position provides integration and advisement to tenants, supports systems before launch through collaboration with various teams, and performs feasibility assessments, requirements creation, project management, and technical solution integration and testing.

Requirements

  • Bachelor’s degree, or foreign equivalent degree, in Computer Science, Electronics, Information Technology, Engineering, or Communications
  • 5 Years of progressive, postbaccalaureate experience in the job offered or 5 Years of progressive, postbaccalaureate experience in a related occupation utilizing Linux, KVM, OpenStack, Kubernetes, Containers, Cloud Infrastructure platforms
  • Monitoring of the server Infrastructure
  • Troubleshooting and systems administration of Linux, cloud, OpenStack, and Kubernetes environments
  • Working on operations and maintenance of OpenStack modules like Keystone, Nova, Neutron, Swift, Cinder, Heat, Glance, Horizon, and Fuel
  • Implementing and maintaining High Availability, DRS, Fault Tolerance, Scalability, and Reliability
  • Utilizing CI/CD Pipeline in Jenkins and GitHub
  • Creating, reviewing, approving and implementing changes in NC server environment
  • Utilizing Python and shell scripting
  • Repairing Dell and HP x86-based hardware

Responsibilities

  • Perform deployment of software updates on a cloud platform composed of Linux OS, BIOS/firmware, OpenStack, Kubernetes, Calico, Ceph, MariaDB, and other software components.
  • Ensure continual service availability, and troubleshoot and resolve production problems.
  • Provide incident response, management and root cause analysis.
  • Perform proactive development and implementation of monitoring systems.
  • Maintain and continuously improve tools, ad-hoc scripting and automation infrastructure for configuration management, maintenance, testing, auditing, problem remediation and capacity planning.
  • Collaborate with and manage vendors to drive defect resolution and enhancements for internal customers to meet departmental or enterprise business needs, in addition to providing extended support as needed.
  • Conduct routine hardware and software audits of servers to ensure compliance with established standards, policies, and configuration guidelines.
  • Perform the setup, maintenance, and monitoring of backups of supported infrastructure.
  • Develop, promote, and curate standard operating procedures and methods of procedure in conjunction with risk assessments to reduce hazards to the network.
  • Perform security compliance, auditing and remediation based on AT&T security policies in collaboration with CSO.
  • Perform Operational Readiness Testing and Operational Acceptance Testing.
  • Provide integration and advisement to tenants.
  • Provide support of systems before launch through collaboration with TechArch, Labs, application teams, and vendors by providing system design/architecture guidance, system review and testing to ensure the adherence to AT&T requirements and fulfillment of AT&T needs.
  • Perform feasibility assessments, create requirements, manage projects, and integrate and test technical solutions for software.

Benefits

  • Medical/Dental/Vision coverage
  • 401(k) plan
  • Tuition reimbursement program
  • Paid Time Off and Holidays (based on date of hire, at least 23 days of vacation each year and 9 company-designated holidays)
  • Paid Parental Leave
  • Paid Caregiver Leave
  • Additional sick leave beyond what state and local law require may be available but is unprotected
  • Adoption Reimbursement
  • Disability Benefits (short term and long term)
  • Life and Accidental Death Insurance
  • Supplemental benefit programs: critical illness/accident hospital indemnity/group legal
  • Employee Assistance Programs (EAP)
  • Extensive employee wellness programs
  • Employee discounts up to 50% off on eligible AT&T mobility plans and accessories, AT&T internet (and fiber where available) and AT&T phone