AI Cloud Security and Infrastructure Engineer

Troutman Pepper • Atlanta, GA

$130,000 - $150,000

About The Position

The AI Cloud Security and Infrastructure Engineer designs, implements, and maintains secure, scalable cloud environments optimized for AI-driven applications. The role focuses on enabling the eMerge Custom Solutions team to securely build and deploy solutions powered by Large Language Models (LLMs) and other advanced AI technologies. The engineer ensures strong security, compliance, and performance across hybrid infrastructures while fostering innovation in AI development workflows.

Requirements

Deep expertise in PKI, encryption standards, and cryptographic hashing methods.
Advanced understanding of core networking concepts, DNS, and securing web services.
Solid knowledge of key compliance frameworks, including ISO 27001, ISO 27701, ISO 42001, and CMMC Level 2.
Exceptional written and verbal communication skills, with the ability to engage both technical and non-technical stakeholders and to explain complex technical concepts clearly to non-technical audiences.
Demonstrated reliability and integrity in managing and safeguarding sensitive information.
Understanding of data governance principles and secure end-to-end model lifecycle management.
Familiarity with prompt engineering practices and techniques for optimizing large language models (LLMs).
Strong skills in creating clear documentation and delivering effective presentations.
Proven capability to troubleshoot and solve problems under pressure and tight deadlines.
Excellent team-oriented service approach and professional client-facing demeanor.
Bachelor’s degree, or equivalent combination of training, education, and experience that demonstrates the ability to perform the core responsibilities of the role.
Minimum five (5) years of experience in cloud infrastructure engineering with a strong emphasis on security.
Relevant certifications, such as Microsoft Certified: Azure Security Engineer Associate, Certified Information Systems Security Professional (CISSP), Certified Cloud Security Professional (CCSP), Certified Information Security Manager (CISM), or AI Governance Professional (AIGP).
Experience designing and managing AI/ML pipelines and implementing DevSecOps practices.
Experience securing containerized environments, including Kubernetes and Docker.
Hands-on expertise with Kubernetes, Docker, and infrastructure-as-code (IaC) tools (e.g., Terraform, ARM templates).
Experience working with MLOps platforms such as MLflow and Kubeflow.
Experience implementing and operating SIEM solutions, preferably Microsoft Sentinel.

Responsibilities

Design and run secure, highly scalable cloud infrastructures (Azure, AWS, GCP) tailored for AI workloads and data-intensive applications.
Build and manage infrastructure-as-code (IaC) using tools such as Terraform, Pulumi, and Azure ARM templates.
Develop, implement, and enforce security best practices for AI pipelines, including data encryption, identity and access management, PKI, and zero-trust architectures.
Maintain compliance with key industry standards, frameworks, and regulations (ISO 27001, ISO 27701, ISO 42001, GDPR, CCPA, CMMC Level 2).
Partner with software engineering and cross-functional teams to secure CI/CD pipelines and clearly communicate technical risk and compliance information to executive stakeholders.
Secure, optimize, and continuously monitor containerized environments for LLM deployments leveraging Docker and Kubernetes.
Support infrastructure for fine-tuning, hosting, and scaling LLMs (OpenAI, Hugging Face, Azure OpenAI Service), including management of GPU/TPU resources.
Configure and oversee DNS, core networking services, and SIEM platforms (e.g., Microsoft Sentinel) to support threat detection, incident triage, and compliance reporting.
Perform security and risk assessments for AI infrastructure and data pipelines and drive the execution of mitigation strategies.
Implement and maintain observability tools (Datadog, Prometheus, Grafana) for AI systems, and lead incident response efforts for security breaches and critical system outages.
Support data asset inventory, tracking, and media auditing to strengthen data governance and control.
Maintain strict confidentiality of information.