Platform Operations Engineer

DecisionPoint | Cortek
Remote

About The Position

DecisionPoint seeks a Platform Operations Engineer to design, install, and maintain secure, scalable infrastructure supporting a modernized Department of Defense (DoD) enterprise system. The engineer will architect, deploy, and sustain mission-critical platform components that power multi-environment cloud operations across AWS GovCloud IL4/IL5. This position integrates infrastructure engineering, automation, and monitoring best practices to ensure optimal system performance, reliability, and compliance with DoD cybersecurity and availability standards. The engineer will collaborate across disciplines to implement Infrastructure as Code (IaC), configure system baselines, and maintain platform resilience under continuous modernization.

This position is fully remote.

Note: By applying to this position, you acknowledge and consent to having your resume included in an active competitive government contract bid.

Requirements

  • Must hold an active Top Secret clearance.
  • Bachelor’s degree in Information Systems, Computer Science, or a related technical discipline.
  • Minimum 7 years of experience in systems engineering, platform operations, or infrastructure management within secure federal or defense environments.
  • Proven success deploying and managing AWS GovCloud or DoD IL4/IL5 environments.
  • Experience integrating automation and monitoring solutions in DevSecOps or Agile development contexts.
  • Proficiency in Linux and Windows Server administration, configuration management, and automation.
  • Expertise with AWS services (EC2, RDS, Lambda, S3, VPC, CloudFormation) and hybrid integrations.
  • Familiarity with containerization technologies (Docker, Kubernetes) and orchestration.
  • Knowledge of networking fundamentals, including VPNs, routing, subnets, and firewalls.
  • Experience implementing Infrastructure as Code (IaC) for provisioning and configuration management.
  • Familiarity with monitoring and log aggregation tools such as CloudWatch, Splunk, or ELK.
  • Understanding of RMF, STIGs, Zero Trust, and DoD cybersecurity policies.
  • Experience with backup, recovery, and high-availability architectures.
  • Strong troubleshooting and analytical skills across complex, distributed systems.
  • Ability to balance system performance, security, and maintainability requirements.
  • Excellent communication and collaboration skills for working across development, security, and operations teams.
  • Detail-oriented mindset with a commitment to proactive monitoring and preventive maintenance.
  • Dedication to innovation, continuous improvement, and mission assurance.

Nice To Haves

  • AWS Certified Solutions Architect – Associate or Professional.
  • CompTIA Security+ CE or equivalent DoD 8570 certification.
  • Red Hat Certified Engineer (RHCE) or Microsoft Certified: Azure Administrator Associate.

Responsibilities

  • Design, implement, and maintain platform infrastructure across cloud and on-prem environments, ensuring performance and security requirements are met.
  • Build and manage infrastructure automation pipelines using Terraform, CloudFormation, or Ansible to enable repeatable deployments.
  • Configure and sustain compute, storage, and network components supporting containerized workloads, APIs, and data services.
  • Install and maintain operating systems and middleware components across IL4/IL5 environments in accordance with DoD standards.
  • Develop and maintain system baselines, configuration templates, and IaC repositories for consistent environment replication.
  • Implement continuous monitoring, logging, and alerting solutions (CloudWatch, Grafana, ELK Stack, Prometheus) to ensure system health and uptime.
  • Support DevSecOps pipelines by integrating automated builds, testing, and compliance scanning into deployment workflows.
  • Perform performance tuning and capacity planning for large-scale cloud and hybrid environments.
  • Manage system hardening, patching, and configuration compliance with DISA STIGs and RMF controls.
  • Collaborate with network and cybersecurity teams to design and maintain secure connectivity across IL tiers.
  • Support incident response and root cause analysis for infrastructure-related outages or vulnerabilities.
  • Create and maintain detailed system documentation, architecture diagrams, and operational guides.
  • Participate in Agile ceremonies, providing engineering insight and updates on system readiness and operational metrics.
  • Contribute to Continuous Service Improvement (CSI) efforts by analyzing performance data and implementing preventive measures.