About The Position

We are seeking a highly skilled Senior Cloud & Distributed Systems Engineer to support mission-critical, cloud-based data repositories serving users within an Intelligence Community environment. This role operates in a dynamic, high-tempo operational environment where requirements evolve rapidly in response to global events. The selected candidate will administer and engineer large-scale Hadoop and Accumulo clusters, ensure system reliability and security, and collaborate across infrastructure, networking, and security domains to maintain continuous mission availability. This is not a traditional system administration role; it is a reliability-focused distributed systems engineering position operating at scale. The position includes after-hours on-call and call-in support.

Requirements

  • 7+ years of Linux systems administration experience
  • 5+ years scripting (Bash, Python, or Perl)
  • 5 years of relevant experience with a BS/BA, or 3 years with an MS/MA
  • Experience supporting large-scale distributed systems across multiple clusters
  • Hands-on experience with Hadoop, Accumulo, and distributed storage technologies
  • Experience with Kubernetes and Docker
  • Familiarity with automation tools (Puppet, Ansible, Salt) and IaC (Terraform or CloudFormation)
  • Knowledge of monitoring platforms (Prometheus, Grafana, ELK, Splunk)
  • Understanding of networking fundamentals (VLANs, TCP/IP, load balancing)
  • Experience with high-availability design, storage architecture, and disaster recovery
  • System hardening and security compliance experience
  • Active TS/SCI clearance with current polygraph
  • Security+ (DoD 8570 compliant)
  • One of the following: AWS Certified SysOps Administrator – Associate, AWS Certified DevOps Engineer – Professional, or Certified Kubernetes Administrator (CKA)

Responsibilities

  • Administer and optimize enterprise Linux systems across large distributed clusters (60+ nodes, multi-rack deployments)
  • Monitor system health, performance, and reliability; troubleshoot complex hardware, software, network, and cloud issues
  • Support and tune Hadoop (HDFS/YARN) and Accumulo environments
  • Engineer monitoring, observability, and alerting solutions
  • Automate infrastructure using scripting and Infrastructure-as-Code
  • Patch, harden, and secure systems in compliance with security standards
  • Manage LDAP-based user accounts and core Linux services (DHCP, DNS)
  • Support Kubernetes-based containerized workloads
  • Participate in architecture discussions and cross-team engineering efforts
  • Contribute to incident response and root cause analysis

Benefits

  • Heavily subsidized employee benefits coverage for you and your dependents
  • 25 days of PTO accrued annually up to a generous PTO cap
  • Participation in an attractive bonus plan
© 2026 Teal Labs, Inc