About The Position

We are seeking a highly skilled Senior Cloud & Distributed Systems Engineer to support mission-critical, cloud-based data repositories serving users within the Intelligence Community. This role operates in a dynamic, high-tempo operational environment where requirements evolve rapidly in response to global events. The selected candidate will administer and engineer large-scale Hadoop and Accumulo clusters, ensure system reliability and security, and collaborate across infrastructure, networking, and security domains to maintain continuous mission availability. This is not a traditional system administration role; it is a reliability-focused distributed systems engineering position operating at scale. This position includes after-hours on-call and call-in support.

Requirements

  • 7+ years of Linux systems administration experience
  • 5+ years scripting (Bash, Python, or Perl)
  • 5 years of relevant experience with a BS/BA, or 3 years with an MS/MA
  • Experience supporting large-scale distributed systems across multiple clusters
  • Hands-on experience with Hadoop, Accumulo, and distributed storage technologies
  • Experience with Kubernetes and Docker
  • Familiarity with automation tools (Puppet, Ansible, Salt) and IaC (Terraform or CloudFormation)
  • Knowledge of monitoring platforms (Prometheus, Grafana, ELK, Splunk)
  • Understanding of networking fundamentals (VLANs, TCP/IP, load balancing)
  • Experience with high-availability design, storage architecture, and disaster recovery
  • System hardening and security compliance experience
  • Active TS/SCI clearance with current polygraph
  • Security+ (DoD 8570 compliant)
  • One of the following: AWS Certified SysOps Administrator – Associate; AWS DevOps Engineer – Professional; or Certified Kubernetes Administrator (CKA)

Responsibilities

  • Administer and optimize enterprise Linux systems across large distributed clusters (60+ nodes, multi-rack deployments)
  • Monitor system health, performance, and reliability; troubleshoot complex hardware, software, network, and cloud issues
  • Support and tune Hadoop (HDFS/YARN) and Accumulo environments
  • Engineer monitoring, observability, and alerting solutions
  • Automate infrastructure using scripting and Infrastructure-as-Code
  • Patch, harden, and secure systems in compliance with security standards
  • Manage LDAP-based user accounts and core Linux services (DHCP, DNS)
  • Support Kubernetes-based containerized workloads
  • Participate in architecture discussions and cross-team engineering efforts
  • Contribute to incident response and root cause analysis

Benefits

  • Peraton offers enhanced benefits to employees working on this critical National Security program, including heavily subsidized benefits coverage for you and your dependents, 25 days of PTO accrued annually (up to a generous PTO cap), and participation in an attractive bonus plan.