AI/ML Engineer

CCS INC•Plano, TX

11d•$65 - $75

About The Position

This role focuses on designing and implementing scalable, secure AWS architectures for LLM and GenAI platforms. The engineer will lead the integration of API-based and self-hosted LLMs, implement RAG solutions, and develop prompt engineering strategies. Responsibilities also include building and maintaining vector stores, data ingestion and processing pipelines, microservices, and serverless systems, as well as Python development for AI tooling. The role also covers security, governance, and cross-functional leadership.

Requirements

Bachelor’s degree.
6+ years of cloud architecture experience.
3+ years of experience building production GenAI/LLM systems on AWS.
Strong Python and AWS expertise, including Lambda, ECS/EKS, S3, SageMaker, Docker and Kubernetes.
Production experience with vector databases and with designing ingestion and embedding pipelines for both batch and streaming workloads.
Hands-on experience with prompt design, evaluation, LLM orchestration, and RAG implementation patterns.
Experience deploying and operating model-serving or MCP-like server infrastructure (self-hosted or managed).
Proficient with IaC and delivery tooling, including Terraform/CloudFormation, GitOps, and CI pipelines.
Experience with model-serving infrastructure, such as Amazon SageMaker, NVIDIA Triton, Ray Serve, or similar platforms.
Hands-on experience with GenAI libraries and frameworks, including LangChain, LlamaIndex, Hugging Face, and OpenAI APIs.
Deep operational expertise with vector databases, such as Pinecone, Milvus, Weaviate, or Qdrant.
AWS Solutions Architect, AWS DevOps Engineer, or equivalent industry certifications.

Responsibilities

Cloud Architecture & Infrastructure: Design scalable, secure AWS architectures
LLM & GenAI Platforms: Lead integration of API-based and self-hosted LLMs; implement RAG solutions
Prompting & Evaluation: Develop prompt engineering strategies, reusable templates, and evaluation frameworks
Vector Databases & Retrieval Pipelines: Implement and maintain vector stores (OpenSearch, Pinecone, Milvus, Qdrant)
Data Ingestion & Processing Pipelines
Microservices & Serverless Systems
Python Development & AI Tooling
Security, Governance & Cross-Functional Leadership