Azure Systems Administrator - SRE & Data Solutions Architect

General Dynamics Information Technology
Remote

About The Position

We are seeking an experienced FSDS Azure Systems Administrator - SRE & Data Solutions Architect to support and maintain a Microsoft Azure Government (IL5) cloud environment. As a technical leader, this role ensures seamless integration of Azure services to meet mission objectives and modernize cloud operations while reducing technical debt and fostering system resiliency.

The goal is to design, implement, and maintain a highly secure, scalable, and efficient Microsoft Azure Government (IL5) cloud environment while supporting advanced data solutions, Site Reliability Engineering (SRE) practices, and compliance with DoD Cloud Computing SRG, NIST 800-53, and FedRAMP High standards. The position drives innovation through the adoption of AI/ML, DevOps, and Infrastructure as Code (IaC) techniques while ensuring cloud optimization, automation, and data security.

The ideal candidate has deep technical expertise in Azure Government environments, with a strong focus on regulatory compliance, cloud optimization, and advanced data solutions leveraging AI/ML. They excel in implementing SRE practices, Infrastructure as Code (IaC), and automation to drive scalability, resiliency, and operational efficiency while maintaining a robust security posture. As a collaborative leader, they foster innovation, mentor teams, and ensure seamless delivery of mission-critical cloud and data solutions.

If this sounds like you and you'd love to grow and learn with a team that's dedicated to excellence and a company that's committed to your advancement, then we'd love to hear from you!

Requirements

  • 10+ years of related professional experience required; 5+ years of that experience specifically with Azure strongly preferred
  • Strong proficiency managing data-centric Azure environments, including workload optimization for AI/ML and advanced data solutions
  • Familiarity with frameworks and tools supporting data ingestion, transformation, and reporting (e.g., Azure Data Factory, Power BI, Databricks)
  • Strong experience managing Azure Government environments (IL5 or equivalent), including secure cloud operations
  • Experience with cloud resource monitoring, backup/recovery, and scaling operations in Azure
  • Experience with chaos engineering practices to validate system resiliency
  • Proven ability to design self-healing architectures and automate incident recovery
  • Experience with vulnerability scanning tools and security monitoring solutions, such as Microsoft Sentinel
  • Experience embedding automation with data solutions and pipelines while optimizing CI/CD for AI/ML
  • Proficiency with Infrastructure as Code tools (Bicep, ARM, Terraform) and CI/CD pipeline management (e.g., GitLab)
  • Hands-on experience with Entra ID, RBAC, Privileged Identity Management (PIM), and Conditional Access policies
  • Must possess an active Secret or higher clearance and maintain it as a condition of employment
  • US Citizenship Required

Nice To Haves

  • Technical training, certification, or degree in Computer Science, Engineering, or a related field required
  • Bachelor’s degree in Computer Science, Engineering, or a related field strongly preferred; advanced degrees a plus

Responsibilities

  • Expert Technical Leadership: Deliver advanced expertise in Azure cloud technologies, Site Reliability Engineering (SRE) practices, and data solutions. This includes designing, building, and optimizing APIs, implementing robust identity and access management (IAM) systems, and establishing stringent data security protocols to protect sensitive information
  • Translate program and organizational visions into comprehensive technical architectures across Levels 0-3, leveraging tools like Microsoft Visio, Lucidchart, and Azure Architecture Center
  • Architect and manage resilient Azure environments that emphasize high availability, disaster recovery, and performance optimization using tools such as Azure Monitor, Application Insights, and Log Analytics
  • Provide hands-on leadership and mentorship to teams, enabling seamless adoption and proper utilization of advanced Azure services, such as Azure Kubernetes Service (AKS), Azure Data Factory, Logic Apps, and Cosmos DB
  • Drive adoption of Infrastructure as Code (IaC) tools, such as Terraform, ARM templates, and Bicep, to automate deployment pipelines and enforce configuration consistency across environments
  • Implement and standardize DevOps practices utilizing Azure DevOps, GitHub Actions, and CI/CD pipelines to ensure rapid and reliable delivery of applications and updates
  • Lead efforts to optimize database performance in Azure SQL, PostgreSQL, and Cosmos DB environments, while ensuring efficient data structuring, warehousing, and analytics capabilities
  • Conduct detailed technical training sessions and workshops to upskill team members on Azure services, DevOps best practices, and the latest technologies
  • Design and deploy self-healing workflows for increased system reliability and efficient incident recovery
  • Advanced Data Solutions & AI/ML Integration: Design and implement data-centric solutions using Azure services such as Data Factory, Synapse Analytics, and Azure Data Lake
  • Support the integration of AI/ML workloads using Azure Machine Learning services, leveraging cloud optimization for scalable processing and analysis
  • Drive the adoption of serverless computing and event-driven architectures for advanced data solutions
  • Azure Cloud Infrastructure Operations: Administer and maintain Azure Government IL5 environments across multiple subscriptions (Dev, Test, Stage, Prod)
  • Implement and maintain RBAC and least-privilege access models
  • Conduct VM size and OS upgrades, adjust infrastructure scaling, and optimize for data-processing workloads
  • Establish and maintain Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) to ensure reliability, scalability, and performance
  • Implement chaos engineering practices to validate system reliability and resiliency
  • Patch Management & System Maintenance: Perform OS patching and ensure patch compliance with IL5/DoD DISA STIG requirements
  • Backup & Disaster Recovery: Ensure proper backup policies are applied and validate backups through disaster recovery testing
  • Maintain Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO)
  • Leverage tagging strategies for cost tracking, forecasting, and resource management
  • Security & Compliance: Support automated IL5 compliance aligned with DoD Cloud Computing SRG, NIST 800-53, and FedRAMP High requirements
  • Manage Microsoft Defender for Cloud recommendations and support vulnerability scanning (e.g., Tenable/Nessus)
  • Assist with log aggregation, monitoring (Microsoft Sentinel), and Azure Monitor alerting configurations

Benefits

  • Comprehensive benefits and wellness packages
  • 401K with company match
  • Competitive pay and paid time off
  • Full-flex work week to own your priorities at work and at home