Obsidian Solutions Group • Posted 6 months ago
Full-time • Mid Level
Hybrid • Quantico, VA
Administrative and Support Services

We are seeking a Site Reliability Engineer - Journeyman to support the performance, reliability, and security of enterprise infrastructure in a mission-critical environment. This role focuses on automation, monitoring, and incident response across production and support tiers. The ideal candidate will bring strong scripting skills, cloud experience, and a proactive approach to system optimization and resilience.

Responsibilities

  • Monitor and maintain infrastructure performance and availability across production and support environments
  • Implement and manage CI/CD pipelines and automation for system monitoring, patching, and recovery
  • Troubleshoot service disruptions, perform root cause analysis, and implement corrective actions
  • Collaborate with operations, engineering, and cybersecurity teams to ensure secure and scalable system operations
  • Support capacity planning, performance tuning, and system optimization initiatives
  • Document system events, changes, and performance metrics in accordance with operational standards
  • Support interface monitoring and data integrity across integrated systems
  • Maintain audit readiness and compliance with operational standards

Qualifications

  • Bachelor's degree in Computer Science, Information Systems, or a related technical field
  • 4-6 years of experience in systems or network engineering, DevOps, or site reliability engineering
  • Experience managing and maintaining enterprise infrastructure performance, reliability, and security
  • Proficiency with CI/CD tools and automation frameworks (e.g., Jenkins, GitLab CI, Ansible, Terraform)
  • Strong scripting skills in languages such as Python, Bash, or PowerShell
  • Experience with monitoring, alerting, and log analysis tools (e.g., Splunk, Prometheus, Grafana)
  • Familiarity with cloud and virtualization technologies (e.g., AWS, OCI, VMware)
  • Understanding of disaster recovery, backup/restore, and incident response procedures
  • Active DoD Secret Clearance or the ability to obtain one
  • Experience supporting secure federal or defense-related IT environments
  • Familiarity with Oracle Exadata, ZFS, and InfiniBand networking
  • Experience with performance tuning and root cause analysis (RCA) in mission-critical systems
  • Knowledge of IAVA compliance, STIGs, and DoD cybersecurity standards
  • Exposure to Agile/SAFe environments and participation in sprint planning and retrospectives
  • Experience supporting interface monitoring and data integrity across integrated systems

Benefits

  • Competitive compensation package
  • Exceptional benefits that protect the well-being of employees and their families
  • Family atmosphere with a commitment to operational excellence