About The Position

The Boeing Company is seeking a Machine Learning Infrastructure Engineer (Associate or Experienced) in Huntsville, AL. This Infrastructure Engineer will support our Artificial Intelligence Innovation Lab infrastructure environments, with an emphasis on Artificial Intelligence and Machine Learning (AI/ML) development platforms. The position supports Linux and Windows systems, cloud and virtualization platforms, container orchestration technologies, and secure networking services. The selected candidate will work under general direction and collaborate with cross-functional teams, including AI/ML engineers, DevOps, cybersecurity, and network engineering.

Our teams are currently hiring for a broad range of experience levels, including Associate or Experienced level Software Engineers. Boeing offers a comprehensive benefits package including generous Paid Time Off (PTO), a flexible work environment, paid parental leave, industry-leading retirement benefits with strong matching, very generous tuition assistance for earning advanced degrees, and paid medical leave programs.

Requirements

  • Bachelor’s degree in an Engineering discipline
  • Ability to obtain a U.S. Secret Security Clearance
  • 1+ years of experience with Linux and Windows operating systems at the system administrator level
  • 1+ years of experience developing software using Docker or Kubernetes for container-based applications
  • 1+ years of experience with computer networking and storage concepts and architectures
  • 1+ years of experience with cloud-hosted platforms such as AWS, Google Cloud Platform, or Azure

Nice To Haves

  • Bachelor of Science degree in Engineering, Engineering Technology (including Manufacturing Technology), Computer Science, Data Science, Mathematics, Physics, or Chemistry
  • 3 or more years' related work experience or an equivalent combination of education and experience
  • Active U.S. Secret Security Clearance
  • 3+ years of experience supporting lab infrastructure environments
  • Experience with or exposure to virtualization technologies such as VMware, Hyper-V, Proxmox, or similar
  • Basic experience with automation or scripting tools (Python, Bash, PowerShell, Ansible, Terraform, or similar)
  • Familiarity with CI/CD concepts and tools, preferably GitLab
  • Understanding of GPU compute concepts or high-performance workloads
  • Experience working with Kubernetes clusters in production or lab environments
  • Familiarity with distributed storage systems and storage networking
  • Knowledge of security best practices and compliance frameworks such as NIST
  • Experience supporting AI/ML or high-performance computing (HPC) environments
  • Experience troubleshooting VPN and complex network connectivity issues
  • Relevant certifications such as RHCE, Microsoft Certified: Azure Administrator, Cisco CCNA, or equivalent
  • Strong problem-solving, communication, and collaboration skills

Responsibilities

  • Performs Linux and Windows system administration tasks including system monitoring, patching, updates, and routine maintenance.
  • Supports compliance with enterprise IT policies, cybersecurity standards, and regulatory requirements.
  • Assists with deployment and support of MLOps tooling used to manage GPU compute resources for AI/ML workloads.
  • Supports management and operation of compute infrastructure used by AI and ML development teams.
  • Assists in configuring and maintaining network devices (firewalls, switches) to ensure secure and reliable operations.
  • Troubleshoots network connectivity issues, including VPN access, escalating complex issues to senior engineers as required.
  • Assists in optimizing cloud infrastructure resources to improve performance, cost efficiency, and scalability.
  • Supports virtualization platforms and cluster technologies, ensuring availability and performance.
  • Assists with administration of distributed storage and storage networking systems.
  • Supports Kubernetes cluster operations using platforms such as Rancher and OpenShift, ensuring cluster health and security.
  • Uses Kubernetes tooling to assist with containerized application deployment and orchestration.
  • Assists in administration of core network services, including DNS, DHCP, and LDAP across Linux and Windows environments.
  • Documents infrastructure changes and asset data in Data Center Infrastructure Management (DCIM) and IP Address Management (IPAM) systems.
  • Assists with deployment and management of local generative AI models, supporting infrastructure optimization for AI workloads.
  • Collaborates with DevOps teams to develop, support, and troubleshoot CI/CD pipelines, primarily using GitLab.
  • Assists with automation of infrastructure provisioning and deployment workflows using scripting and infrastructure-as-code tools.

Benefits

  • Generous Paid Time Off (PTO)
  • Flexible work environment
  • Paid parental leave
  • Industry-leading retirement benefits with strong matching
  • Very generous tuition assistance for earning advanced degrees
  • Paid medical leave programs