Platform Engineer

Centuria
Greenbelt, MD
5d

About The Position

The Space Science Mission Operations group at NASA Goddard runs the ground software that keeps satellites alive and science data moving. A lot of that software is old. It's installed by hand, configured by hand, and understood by a shrinking number of people. We're moving it to Kubernetes, on-premise, and we need someone who can lead that work without losing the mission operators in the process.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, Technology, or a related field required.
  • 7+ years of experience working with AWS, RHEL, Windows Server, and Active Directory (AD) required.
  • Experience operating Kubernetes clusters.
  • Strong interpersonal, written, and oral communication skills.
  • Strong time-management and prioritization skills.
  • Strong RHEL/Linux/Unix background.
  • Scripting experience (Shell, Perl, Python, or a related language).
  • Strong understanding of network protocols and principles (OSI network layers, TCP/IP).
  • Experience designing and managing virtual infrastructures (VMware, Nutanix, Hyper-V).
  • Strong reasoning, problem solving, and troubleshooting skills.
  • Expertise in creating, analyzing, and maintaining large-scale infrastructure deployments.
  • Experience with private-cloud backup and recovery.

Responsibilities

  • Analyze, configure, install, upgrade, monitor, maintain, troubleshoot, patch, compile, secure, and repair Linux servers in a heterogeneous environment.
  • Support AWS services such as CloudFormation templates, S3, EC2, and EKS.
  • Provide technical guidance and maintain operations standards.
  • Participate in the validation matrix role, independently testing and validating solutions to ensure they fulfill all business requirements, comply with regulations, and meet quality standards.
  • Build, deploy, manage, and maintain new and existing infrastructure systems.
  • Monitor and analyze data center infrastructure resources and system applications.
  • Implement and maintain backup and redundancy strategies.
  • Take responsibility for the security of the managed infrastructure systems.
  • Collaborate on and define policies, procedures, and technical documentation, including reports and design, maintenance, operational, and support procedures.
  • Lead the ongoing development of automation strategies.
  • Keep upper management and stakeholders informed through requirements, status reports, activity summaries, and achievements.
  • Communicate with stakeholders to identify their needs and requirements.
  • Administer and optimize hosts and servers at an expert level to ensure high availability and effective resource management.
  • Standardize and automate processes and monitoring using scripting and the advanced technologies offered by the data center.
  • Install and configure operating systems, software, and hardware components, and clearly document the design, maintenance, and support procedures so that IT support staff can handle routine tasks.
  • Routinely test software for bugs, redundancies, and security issues.
  • Conduct high-level root-cause analysis for service interruption and establish preventive measures.
  • Create reports and documentation outlining findings and solutions; oversee the overall backup strategy and daily operations for secure backups and restore testing.