Senior AI/ML Cloud Engineer

Vanguard
Malvern, PA
Hybrid

About The Position

In This Role You Will:

  • Design, develop, and maintain Generative AI platforms across multiple cloud providers (e.g., AWS, Azure, and GCP), including supporting infrastructure and tooling.
  • Contribute to the development of our AI platform, ensuring scalability, reliability, and efficiency; support full-stack applications used to manage the platform.
  • Monitor platform performance and implement optimization strategies.
  • Present proofs of concept, demonstrate solutions to stakeholders, and partner with project managers, engineering managers, architects, and application teams.
  • Stay current with industry trends and advancements in Generative AI and agentic technologies; document processes, issues, and solutions to support knowledge sharing.
  • Prepare for future initiatives to enable new AI capabilities within the enterprise platform.
  • Provide support to platform users by troubleshooting issues, collaborating cross-functionally, delivering enhancements, and responding to inquiries.
  • Collaborate closely with cross-functional teams including product managers and technical leads to define and achieve platform goals.

Our Tech Stack:

We welcome candidates from diverse backgrounds who are passionate about AI and DevOps. Technologies you may work with include:

  • AWS Services: Bedrock, Bedrock AgentCore, SageMaker, CloudFormation, Lambda, Step Functions, IAM, S3, CloudWatch
  • Other Cloud Platforms: Microsoft Azure OpenAI, GCP Vertex
  • Programming & Scripting: Python
  • CI/CD & Code Management: Git, GitHub, GitHub Actions

Bonus Skills:

  • Terraform
  • Experience developing and deploying AI agents, large language model (LLM) integrations, and retrieval-augmented generation (RAG) techniques
  • Experience provisioning Generative AI cloud services on AWS, Azure, and/or GCP

Requirements

  • Minimum of five years of related work experience.
  • Undergraduate degree in a related field, or an equivalent combination of training and experience.

Nice To Haves

  • Terraform
  • Experience developing and deploying AI agents, large language model (LLM) integrations, and retrieval-augmented generation (RAG) techniques
  • Experience provisioning Generative AI cloud services on AWS, Azure, and/or GCP

Responsibilities

  • Design, implement, and deploy cloud platforms to meet business needs.
  • Identify, recommend, and implement improvements and solutions to performance issues.
  • Provide advanced technical support and ensure reliable operation of cloud production environments.
  • Troubleshoot software systems across multiple cloud platforms and support system integration.
  • Maintain comprehensive technical knowledge of software and infrastructure platforms.
  • Develop technical standards.
  • Test and evaluate IT vendor products.
  • Review configuration parameters to optimize system performance and maximize uptime.
  • Promote code through development, test, and cloud production environments on schedule.
  • Provide follow-up production support.
  • Submit change-control requests and maintain required documentation.
  • Learn and understand client-area business functions and requirements.
  • Determine the appropriate technical tool to address the client's business needs.
  • Train and mentor more junior staff on processes and releases.
  • Troubleshoot and resolve complex issues escalated by staff.
  • Provide guidance and consultation as required.
  • Update, write, and maintain documentation for the department.
  • Administer system activities.
  • Write the technical portion of assigned deliverables.
  • Perform systems analysis, including system requirements analysis and definition, and logical design.
  • Participate in special projects and perform other duties as assigned.