Lead Systems Administration

AT&T, Plano, TX
Onsite

About The Position

The Lead Systems Administration role involves deploying software updates on a cloud platform consisting of Linux OS, BIOS/firmware, OpenStack, Kubernetes, Calico, Ceph, MariaDB, and other software components. The position ensures continual service availability, troubleshoots and resolves production problems, and provides incident response, incident management, and root cause analysis. Responsibilities include proactively developing and implementing monitoring systems, and maintaining and improving tools, scripting, and automation infrastructure for configuration management, maintenance, testing, auditing, problem remediation, and capacity planning. The role also involves collaborating with and managing vendors to drive defect resolution and enhancements, conducting routine hardware and software audits, setting up, maintaining, and monitoring backups, developing standard operating procedures, performing security compliance and remediation, and conducting Operational Readiness Testing and Operational Acceptance Testing. Additionally, the role provides integration and advisement to tenants, supports systems before launch through collaboration with various teams, and performs feasibility assessments, requirements creation, project management, and technical solution integration and testing.

Requirements

  • Bachelor’s degree, or foreign equivalent degree, in Computer Science, Electronics, Information Technology, Engineering, or Communications
  • 5 years of progressive, postbaccalaureate experience in the job offered, or 5 years of progressive, postbaccalaureate experience in a related occupation utilizing Linux, KVM, OpenStack, Kubernetes, containers, and cloud infrastructure platforms
  • Monitoring server infrastructure
  • Troubleshooting and systems administration of Linux, cloud, OpenStack, and Kubernetes environments
  • Operating and maintaining OpenStack modules such as Keystone, Nova, Neutron, Swift, Cinder, Heat, Glance, Horizon, and Fuel
  • Implementing and maintaining High Availability, DRS, Fault Tolerance, Scalability, and Reliability
  • Utilizing CI/CD pipelines in Jenkins and GitHub
  • Creating, reviewing, approving and implementing changes in NC server environment
  • Utilizing Python and shell scripting
  • Repairing Dell and HP x86-based hardware

Responsibilities

  • Deploy software updates on a cloud platform composed of Linux OS, BIOS/firmware, OpenStack, Kubernetes, Calico, Ceph, MariaDB, and other software components.
  • Ensure continual service availability; troubleshoot and resolve production problems.
  • Provide incident response, management and root cause analysis.
  • Perform proactive development and implementation of monitoring systems.
  • Maintain and continuously improve tools, ad-hoc scripting and automation infrastructure for configuration management, maintenance, testing, auditing, problem remediation and capacity planning.
  • Collaborate with and manage vendors to drive defect resolution and enhancements for internal customers to meet departmental or enterprise business needs, in addition to providing extended support as needed.
  • Conduct routine hardware and software audits of servers to ensure compliance with established standards, policies, and configuration guidelines.
  • Perform the setup, maintenance, and monitoring of backups of supported infrastructure.
  • Develop, promote, and curate standard operating procedures and methods of procedure in conjunction with risk assessments to reduce hazards to the network.
  • Perform security compliance, auditing and remediation based on AT&T security policies in collaboration with CSO.
  • Perform Operational Readiness Testing and Operational Acceptance Testing.
  • Provide integration and advisement to tenants.
  • Provide support of systems before launch through collaboration with TechArch, Labs, application teams, and vendors by providing system design/architecture guidance, system review and testing to ensure the adherence to AT&T requirements and fulfillment of AT&T needs.
  • Perform feasibility assessments, create requirements, manage projects, and integrate and test technical solutions for software.

Benefits

  • Medical/Dental/Vision coverage
  • 401(k) plan
  • Tuition reimbursement program
  • Paid Time Off and Holidays (based on date of hire, at least 23 days of vacation each year and 9 company-designated holidays)
  • Paid Parental Leave
  • Paid Caregiver Leave
  • Additional sick leave beyond what state and local law require may be available but is unprotected
  • Adoption Reimbursement
  • Disability Benefits (short term and long term)
  • Life and Accidental Death Insurance
  • Supplemental benefit programs: critical illness/accident hospital indemnity/group legal
  • Employee Assistance Programs (EAP)
  • Extensive employee wellness programs
  • Employee discounts up to 50% off on eligible AT&T mobility plans and accessories, AT&T internet (and fiber where available) and AT&T phone