Application Reliability & Support Engineer

CACI•Fort Meade, MD

Onsite

About The Position

Join our dynamic team supporting the Secure the Enterprise initiative by helping transform today’s manual system security evaluation and authorization processes into a modern, automated, and continuously monitored environment. In this role, you’ll work across multiple technical domains—software development, systems engineering, systems administration, and operations support—to keep critical applications running smoothly and securely throughout their entire lifecycle. This is a hands-on position at the heart of mission operations, offering a fast-paced environment, broad technical exposure, and the opportunity to directly influence enterprise readiness and resiliency.

Requirements

Active TS/SCI w/ Polygraph
All experience and education must be technical and directly related to the role.
All education must come from accredited institutions.
10 years of experience with a High School Diploma/GED
8 years of experience with an Associate’s degree
6 years of experience with a Bachelor’s degree
4 years of experience with a Master’s degree
2 years of experience with a Doctorate
Candidates must be proficient in one or more of the following areas:
Monitoring & Watchfloor Operations: Grafana, Kibana, Splunk, CloudWatch, or comparable tools; experience with dashboard monitoring, alert triage, log analysis, incident escalation, and shift turnover.
Linux / Systems Administration: Comfort operating in Linux environments, including conducting system health checks, managing services, reviewing logs, and safely rebooting servers.
Cloud / AWS Operations: Experience with C2E, HCI, AWS, Lambda, AWS Console, or AWS CLI, particularly in supporting EC2 instances, performing health checks, and basic operational troubleshooting.
Dataflow / Application Support: Familiarity with data pipeline tools such as Apache NiFi, Cribl, Kafka, or Logstash; experience monitoring job health, queues, ingestion flows, and data movement.
DevOps Tools: Exposure to Terraform, Ansible, GitLab Pipelines, or Docker.

Nice To Haves

Experience working in mission operations or a 24/7 support environment
Familiarity with automated security assessment workflows
Strong analytical troubleshooting skills and the ability to work independently in high-tempo situations
Excellent communication skills for shift turnovers, incident reporting, and cross-team coordination

Responsibilities

Monitor and maintain application systems to ensure data accuracy, reliability, and overall system health.
Drive operational excellence by identifying performance issues, troubleshooting problems, and implementing improvements to system configuration or code when needed.
Track and interpret key metrics, logs, dashboards, and alerts to proactively prevent downtime and maintain optimal performance.
Support 24/7 watchfloor operations by detecting service degradations, dataflow failures, system errors, and application issues, escalating when required.
Perform foundational Linux administration tasks, including service checks, restarts, log reviews, and resource validation (disk, memory, CPU), as well as supporting server reboots.
Assist with cloud operations in AWS using the Console or CLI to assess instance health, review system status, reboot servers, and provide basic operational troubleshooting.
Participate in a rotating 12‑hour shift schedule (6AM–6PM / 6PM–6AM) to deliver continuous application support.

Benefits

Flexible time off
Robust learning resources
Comprehensive benefits, including healthcare, wellness, financial, retirement, family support, continuing education, and time off benefits
