Sr. Staff AI Engineer - On-Prem AI Infrastructure & Agentic Systems

SK hynix memory solutions America Inc.
San Jose, CA
$140,000 - $165,000
Onsite

About The Position

We are seeking a hands-on AI Engineer to design, deploy, and maintain on-prem AI infrastructure and build agentic AI systems that drive real-world automation. You’ll be responsible for setting up scalable AI environments, implementing RAG pipelines, fine-tuning embedded models, and architecting AI agents that operate autonomously in enterprise settings. This role sits at the intersection of AI systems engineering and applied ML — you’ll bridge infrastructure, model deployment, and agent logic.

Requirements

  • 2+ years of experience in AI/ML engineering, with hands-on deployment of AI systems on-prem or private cloud.
  • Proven experience building agentic AI systems — including state management, tool integration, and multi-step reasoning.
  • Strong working knowledge of RAG architectures — chunking, retrieval, re-ranking, evaluation metrics.
  • Experience with model fine-tuning (LoRA, QLoRA, full fine-tuning) and embedding models for retrieval.
  • Familiarity with Model Control Protocols (MCP) or similar governance frameworks (model versioning, access control, audit trails).
  • Proficiency in Python, Linux, Docker/Kubernetes, and vector databases (e.g., Milvus, Qdrant, Pinecone).
  • Experience with AI serving frameworks (vLLM, TGI, Triton, Ollama, etc.).

Nice To Haves

  • Experience deploying AI in enterprise storage or hardware-adjacent environments.
  • Background in systems engineering or QA automation — bonus if you’ve used AI to automate testing or validation.
  • Familiarity with embedded AI or edge inference (ONNX, TensorRT, GGUF, etc.).
  • Experience with AI agent frameworks (LangGraph, AutoGen, BabyAGI, etc.).
  • Knowledge of AI observability tools (LangSmith, Weights & Biases, Prometheus/Grafana for AI).
  • We are a storage company, so knowledge of storage technologies such as NVMe is a plus.

Responsibilities

  • Design and deploy on-prem AI infrastructure — including GPU clusters, model serving (e.g., vLLM, TGI, Triton), vector DBs (e.g., Milvus, Qdrant, FAISS), and orchestration (Kubernetes, Helm, Docker).
  • Build and optimize RAG pipelines — including document chunking, retrieval strategies (hybrid, re-ranking), and evaluation of retrieval accuracy and latency.
  • Develop agentic AI systems — design stateful agents with memory, tool use, and planning capabilities (e.g., using LangGraph, AutoGen, or custom frameworks).
  • Fine-tune and deploy embedded models — work with LoRA, QLoRA, or full fine-tuning for domain-specific tasks; optimize for edge/on-device inference.
  • Implement Model Control Protocols (MCP) — ensure model governance, versioning, access control, and monitoring for production AI systems.
  • Collaborate with product and engineering teams to integrate AI capabilities into enterprise workflows — especially in storage, QA, or systems engineering contexts.
  • Automate and monitor AI pipelines — build CI/CD for model deployment, logging, and performance tracking.

Benefits

  • medical
  • dental
  • vision
  • life insurance
  • company 401(k) match
  • cafeteria
  • onsite gym