Lockheed Martin - posted 2 months ago
$131,000 - $227,125/Yr
Full-time • Mid Level
Remote • Fort Worth, TX
5,001-10,000 employees
Transportation Equipment Manufacturing

When it comes to using cutting-edge machine learning to tackle complex problems, Lockheed Martin is driven by a singular mission focus and a desire to innovate continuously. Today's challenges to global security aren't just changing; they're accelerating faster than ever before. Our AI-enabled systems are changing the way militaries operate and protect their forces, the way first responders fight fires, and the way researchers explore the far reaches of space and the ocean's depths.

The Lockheed Martin Artificial Intelligence Center (LAIC) team is seeking a mid-career AI Platform Ops & Deployments Engineer to deploy machine learning workload environments and tooling. As an AI Platform Ops & Deployments Engineer at Lockheed Martin, you'll have the opportunity to drive key initiatives and contribute to the growth and development of our AI Factory platform. With a focus on stability, reliability, and innovation, you'll play a critical role in shaping the future of AI/ML computing resources and services.

  • Deploying tooling in customer and disconnected environments and fostering relationships with customer system developers, ensuring seamless integration and functionality
  • Collaborating with stakeholders to identify and prioritize requirements for AI system improvements and new feature development
  • Optimizing infrastructure as code and software bills of materials
  • Providing deployment expertise across the technology stack
  • Addressing user needs and resolving system reliability and stability concerns
  • Staying up to date with emerging AI technologies and trends, and applying this knowledge to improve AI Factory toolsets and systems
  • Experience with Kubernetes, including distributions such as OpenShift, Rancher, or GKE
  • Experience with Linux or UNIX, including Linux distributions such as RHEL (Red Hat Enterprise Linux) or Debian
  • Experience with public cloud computing services, such as AWS, GCP, or Azure
  • Familiarity with diverse IT principles, such as Networking, Storage, Computing, Security, or Distributed Services
  • Familiarity with programming and scripting, such as Python, Go, or Bash
  • US citizenship is required due to system access requirements
  • Experience with DevSecOps software development practices
  • Experience with Collaboration Tools, such as Slack, Confluence, and GitLab
  • Familiarity with computing design and operations
  • Familiarity with Container Storage, including Container Storage Interfaces (CSI) and Persistent Volumes
  • Familiarity with Pipeline and GitOps Automation, such as Argo CD, Tekton, or GitLab CI/CD
  • Familiarity with Kubernetes Automation, such as Helm or Kustomize
  • Familiarity with Infrastructure Automation, such as Ansible or Terraform
  • Knowledge of Server Room design, including server hardware, rack diagrams, power, and cooling requirements
  • Knowledge of Monitoring and Performance, such as Prometheus, Grafana, and Thanos
  • Knowledge of Container Security, such as Falco, Sysdig Secure, Aqua, or Anchore
  • Knowledge of Image Registries, such as Quay or Harbor
  • Knowledge of Storage, such as Ceph, NetApp, or object storage (for example, AWS S3)
  • Knowledge of Machine Learning Architectures, including GPU Computing and High Performance Computing (HPC)
  • Knowledge of AI/ML Orchestration tools, such as Kubeflow or Open Data Hub
  • Medical
  • Dental
  • Vision
  • Life Insurance
  • Short-Term Disability
  • Long-Term Disability
  • 401(k) match
  • Flexible Spending Accounts
  • Employee Assistance Program (EAP)
  • Education Assistance
  • Parental Leave
  • Paid time off
  • Holidays