Intern - AI Engineering SLM

Veolia Environnement SA, Paramus, NJ
$21 - $25

About The Position

Student Exploration and Experience Development (SEED) is a 12-week internship program at Veolia that gives students hands-on experience in sustainability and ecological transformation. Interns work on real-world projects, receive mentorship from industry professionals, and participate in workshops and networking events. The program aims to nurture talent, promote innovation, and foster meaningful connections between students and industry professionals, providing participants with the skills, knowledge, and connections needed to make a positive impact in the industry.

Program Dates: June 1, 2026 to August 21, 2026.

Position Purpose: We are seeking a motivated AI Engineering intern to support the development and implementation of an AI-powered agent. This role offers hands-on experience with cutting-edge small language models (SLMs), cloud infrastructure, and enterprise software development.

Requirements

  • Working towards a PhD degree in AI/ML/Computer Science.
  • 3.8 cumulative G.P.A. required.
  • Strong communication skills, including written, verbal, listening, presentation and facilitation skills.
  • Demonstrated ability to build collaborative relationships.

Responsibilities

  • Understanding and working with lightweight models such as Phi-3 (Microsoft), Llama (Meta), Mistral (Mistral AI), Gemma (Google), and TinyLlama (for resource-constrained environments).
  • Using Figma for UI/UX design and workflow planning.
  • Utilizing Jupyter Notebooks and pandas for data exploration, cleaning, and analysis.
  • Leveraging Hugging Face Model Hub to compare, select, and download pre-trained models.
  • Building SLM-powered applications using LangChain or LangGraph for orchestration and workflow management.
  • Deploying models locally with Ollama or at scale with vLLM for efficient inference.
  • Implementing semantic search and retrieval-augmented generation (RAG) using ChromaDB.
  • Developing RESTful APIs with FastAPI (Python) or Express.js (Node.js) to expose model functionality.
  • Coding in VS Code, ideally with Python and Copilot extensions for productivity.
  • Customizing models using Hugging Face Transformers and parameter-efficient fine-tuning (PEFT) methods like LoRA/QLoRA.
  • Managing and optimizing prompts with LangSmith or PromptLayer.
  • Tracking code and data changes with Git and DVC (Data Version Control).
  • Logging experiments, metrics, and results with MLflow or Weights & Biases.
  • Writing and running tests using pytest or unittest to ensure code correctness.
  • Assessing model performance with RAGAS (for RAG pipelines) and DeepEval.
  • Simulating high-traffic scenarios using Locust or Apache JMeter.
  • Using LangChain Evaluators and custom metrics to ensure output quality and reliability.
  • Packaging applications with Docker or Docker Compose for portability and reproducibility.
  • Managing and scaling containers with Kubernetes (K8s) or Docker Swarm.
  • Accelerating inference and reducing resource usage with ONNX Runtime, OpenVINO, or llama.cpp.
  • Managing and securing APIs with Apigee.
  • Monitoring application and model performance with LangSmith.
  • Automating build, test, and deployment pipelines with GitHub Actions or GitLab CI.
  • Deploying solutions on local servers with GPU or CPU resources.
  • Utilizing GCP (Vertex AI) for scalable, managed AI infrastructure.
  • Integrating edge devices for local inference with cloud backup for resilience and scalability.
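To give a concrete sense of the RAG work described above, here is a minimal sketch of the retrieval step. A real pipeline would use an embedding model and a vector store such as ChromaDB; the toy 3-dimensional vectors and sample documents below are illustrative stand-ins only.

```python
# Illustrative sketch of the retrieval step in a RAG pipeline.
# Toy 3-dimensional vectors stand in for real model embeddings.
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# A tiny in-memory "vector store": (document, embedding) pairs.
# Document titles are hypothetical examples, not real Veolia data.
store = [
    ("Water treatment overview", [0.9, 0.1, 0.0]),
    ("Waste-to-energy plant operations", [0.1, 0.9, 0.1]),
    ("Employee onboarding checklist", [0.0, 0.2, 0.9]),
]

def retrieve(query_embedding, k=1):
    """Return the k documents most similar to the query embedding."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [doc for doc, _ in ranked[:k]]

# A query embedding close to the water-treatment document.
print(retrieve([0.8, 0.2, 0.1]))  # → ['Water treatment overview']
```

The retrieved documents would then be injected into the SLM's prompt as context, which is the "generation" half of retrieval-augmented generation.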
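The data-preparation side of the same pipeline typically splits source documents into overlapping chunks before embedding them. A minimal sketch of word-level chunking (parameter values are arbitrary; production systems often chunk by tokens instead):

```python
def chunk_text(text, chunk_size=5, overlap=2):
    """Split text into overlapping word windows.

    Overlap repeats a few words between adjacent chunks so that
    context cut at a chunk boundary still appears in the next chunk,
    at the cost of some duplicated storage.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

print(chunk_text("a b c d e f g", chunk_size=4, overlap=1))
# → ['a b c d', 'd e f g']
```

Each chunk would then be embedded and stored in the vector database for semantic search.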
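The appeal of the LoRA/QLoRA fine-tuning mentioned above is easy to see with a little arithmetic: instead of training a full weight update dW of shape (d_out x d_in), LoRA trains a low-rank product B @ A with B of shape (d_out x r) and A of shape (r x d_in), where r is much smaller than the layer dimensions. A quick back-of-the-envelope sketch (the 4096 x 4096 layer size and rank 8 are illustrative values, not tied to any specific model):

```python
# Parameter counts for a full-rank update versus a LoRA update.
def full_update_params(d_out, d_in):
    """Trainable values in a dense weight update dW."""
    return d_out * d_in

def lora_params(d_out, d_in, r):
    """Trainable values in the low-rank factors B (d_out x r) and A (r x d_in)."""
    return d_out * r + r * d_in

d_out, d_in, r = 4096, 4096, 8
full = full_update_params(d_out, d_in)  # 16,777,216 trainable values
lora = lora_params(d_out, d_in, r)      # 65,536 trainable values
print(f"LoRA trains {lora / full:.2%} of the full-update parameters")
```

For this example layer, LoRA trains well under 1% of the parameters a full fine-tune would touch, which is what makes fine-tuning SLMs feasible on modest GPU hardware.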