Cloud & AI Infrastructure Engineer

Hanwha Energy USA
Houston, TX
Onsite

About The Position

Hanwha Energy USA, headquartered in Houston, Texas, is part of the Hanwha Group, a FORTUNE Global 300 company. With over a decade of experience, the company has evolved into a comprehensive energy solutions provider, spanning utility-scale renewables, natural gas generation, retail electricity, and strategic partnerships for the data center industry. Its expertise covers the entire energy value chain, from project development to operations and maintenance, integrating advanced technologies and partnerships to deliver reliable, customized solutions. Hanwha Energy USA is actively advancing strategic initiatives in natural gas generation and data center development, including hyperscaler solutions, and is the parent company of Hanwha Renewables and Chariot Energy.

The company is seeking a Cloud & AI Infrastructure Engineer to design, build, operate, and optimize the Azure and AI platform supporting internal applications, AI agents, copilots, automations, and enterprise data workloads. This role focuses on providing a secure, reliable, and cost-effective cloud foundation for scaling AI and application landscapes. The ideal candidate will have strong hands-on experience with Azure infrastructure, infrastructure-as-code, CI/CD, monitoring, observability, and cloud cost optimization.

This engineer will collaborate with Application Development and Data Management to ensure AI-enabled solutions are production-ready, well-governed, and operationally supportable, and will establish modern cloud engineering discipline around provisioning, deployment, logging, monitoring, security, and cost control for AI-enabled systems. The employee may also perform other job-related duties as requested by management.

Requirements

  • Bachelor’s degree in Information Systems, Computer Science, Engineering, or a related field. Equivalent experience may be considered.
  • 4+ years of experience in cloud engineering, platform engineering, DevOps, or infrastructure engineering.
  • Strong hands-on experience with Microsoft Azure services and cloud architecture.
  • Experience with infrastructure-as-code such as Terraform, Bicep, ARM templates, or similar tools.
  • Experience building and maintaining CI/CD pipelines for application and platform deployment.
  • Experience implementing monitoring, logging, alerting, and observability for production systems.
  • Understanding of identity, access control, security, and networking concepts in cloud environments.
  • Experience supporting production applications and cloud platforms with a focus on reliability and operational excellence.
  • Strong troubleshooting and problem-solving skills.
  • Ability to work effectively with software developers, data engineers, and technical leadership.

Nice To Haves

  • Experience supporting AI or machine learning workloads in Azure.
  • Familiarity with Azure OpenAI and the infrastructure patterns needed to support AI-enabled applications.
  • Experience with containerization, orchestration, or modern hosting models.
  • Experience with FinOps, cloud cost optimization, tagging governance, or cost reporting.
  • Experience with secrets management, policy enforcement, and secure deployment patterns.
  • Experience in environments with custom internal applications, integrations, and growing cloud platforms.
  • Familiarity with observability and telemetry practices for distributed systems and AI applications.

Responsibilities

  • Design and provision Azure and AI resources using infrastructure-as-code, ensuring consistency, scalability, and maintainability.
  • Establish and enforce standards for networking, identity, monitoring, security, tagging, and cloud resource governance.
  • Build and maintain CI/CD pipelines for AI-enabled applications, internal platforms, agents, and automations.
  • Implement and support logging, telemetry, monitoring, and observability practices for AI and cloud-based workloads.
  • Partner with Application Development and Data Management to ensure solutions are deployed and operated in a secure, reliable, and supportable manner.
  • Manage Azure and AI platform environments across development, testing, and production.
  • Monitor and optimize Azure and Azure OpenAI usage and spend, right-size resources, and help improve cloud cost visibility and efficiency.
  • Partner with Finance and IT leadership on cloud cost reporting, chargeback visibility, and optimization opportunities.
  • Support identity, access, secrets, and security practices required for enterprise cloud and AI workloads.
  • Help define and maintain operational standards for backup, resiliency, recovery, alerting, and support readiness.
  • Troubleshoot deployment, performance, connectivity, and infrastructure issues affecting applications, agents, and cloud services.
  • Contribute to architecture discussions related to hosting, integration, scaling, environment design, and operational controls.
  • Document cloud patterns, standards, deployment approaches, and support procedures in a clear and practical way.
  • Evaluate and introduce appropriate tools and patterns that improve reliability, delivery speed, visibility, and cost management.