About The Position

Lead AWS networking architecture, application load balancing, and enterprise monitoring/observability implementation for a federal cloud integration solution. The role supports a polypharmacy solution within a complex, multi-system cloud integration for the Department of Veterans Affairs healthcare system, which serves millions of veterans. Responsible for designing secure network segmentation, configuring high-availability load balancing, and establishing comprehensive monitoring across Splunk, Dynatrace, and Datadog platforms to ensure visibility, compliance, and operational excellence for RMF/ATO approval. Due to the nature of our work as a federal consulting organization, employees may be expected to handle Controlled Unclassified Information (CUI) and must adhere to applicable safeguarding and compliance requirements.

Requirements

  • Bachelor's degree in Computer Science, Information Systems, Information Technology, or related technical field (relevant certifications and experience may supplement)
  • 5-7 years of experience in network engineering, systems engineering, or DevOps roles
  • 3+ years of hands-on experience with AWS networking and load balancing in production environments
  • 2+ years of experience with enterprise monitoring platforms (Splunk, Dynatrace, Datadog, or similar)
  • Strong analytical and troubleshooting skills for network and performance issues
  • Attention to detail and commitment to security-first network design
  • Effective written and verbal communication for documentation and cross-team collaboration
  • Ability to work independently and manage concurrent networking and monitoring workstreams
  • Adaptable to a fast-paced, deadline-driven environment with changing requirements
  • Proactive identification of network bottlenecks and monitoring gaps
  • Understanding of NIST 800-53 security controls for networking and monitoring

Nice To Haves

  • AWS Certified Advanced Networking - Specialty or Solutions Architect - Professional
  • Splunk Certified Admin, Dynatrace Professional, or Datadog certification
  • Experience with AWS WAF, Shield, and security monitoring services
  • Knowledge of service mesh technologies (Istio, AWS App Mesh)
  • Federal government contracting or DoD networking experience
  • Experience with network automation using Terraform or CloudFormation
  • Experience in federal, DoD, or highly regulated environments
  • Prior involvement in RMF/ATO processes with network security control implementation

Responsibilities

  • Design and implement AWS networking architecture including VPC design, subnets, route tables, security groups, and NACLs
  • Configure Application Load Balancer (ALB) with target groups, health checks, SSL/TLS termination, path-based routing, and WAF integration
  • Implement network security controls for federal compliance including network segmentation, encryption in transit, and zero-trust principles
  • Design multi-AZ high availability architecture ensuring resilience during infrastructure failures
  • Understand and coordinate Transit Gateway, PrivateLink, and VPC peering for secure multi-system connectivity
  • Implement container networking including service discovery, ingress controllers, and network policies
  • Manage VPC Flow Logs and network traffic analysis for security monitoring and troubleshooting
  • Create network diagrams, boundary protection documentation, and data flow diagrams for RMF compliance
  • Implement and configure enterprise monitoring platforms (Splunk, Dynatrace, and/or Datadog) for comprehensive system visibility
  • Design monitoring architecture covering containers, load balancers, APIs, databases, and data pipelines
  • Configure audit logging and SIEM integration for federal compliance requirements including who-did-what-when traceability
  • Establish alert design, escalation policies, and incident response integration for operational excellence
  • Create dashboards for technical teams, operations, and compliance stakeholders
  • Integrate AWS CloudWatch, CloudTrail, and VPC Flow Logs with enterprise monitoring platforms
  • Implement performance monitoring, capacity planning, and baseline establishment for anomaly detection
  • Configure distributed tracing and application performance monitoring (APM) for multi-tier applications
  • Design network architecture supporting zero-downtime deployments and automatic failover
  • Configure load balancer health checks, connection draining, and traffic distribution algorithms
  • Implement DNS failover strategies and multi-region considerations for disaster recovery
  • Test and validate network failover scenarios and recovery procedures
  • Monitor network performance metrics and optimize for latency, throughput, and reliability
  • Implement network security controls aligned with NIST 800-53 requirements
  • Configure encryption in transit (TLS 1.2+) across all network communication paths
  • Apply least-privilege network access policies using security groups and NACLs
  • Implement network intrusion detection and prevention monitoring
  • Document network security controls and monitoring capabilities for RMF/ATO security assessment
  • Configure compliance logging with appropriate retention policies for audit requirements
  • Monitor and alert on security events, anomalous network traffic, and compliance violations
  • Create comprehensive network architecture diagrams, IP addressing schemes, and routing documentation
  • Develop operational runbooks for network troubleshooting, load balancer management, and monitoring response procedures
  • Document monitoring alert thresholds, escalation procedures, and incident response playbooks
  • Maintain network and monitoring configuration baselines for compliance and change management
  • Collaborate with container platform team on networking requirements and service mesh integration
  • Work with developers on application health check design and monitoring instrumentation
  • Partner with testing team on performance monitoring and load testing metric collection
  • Support security teams with network traffic analysis and security event investigation

Benefits

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401k)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Paid Time Off
  • Family Leave (Maternity, Paternity)
  • Short-Term & Long-Term Disability
  • Training & Development