Senior AI/ML Engineer (GenAI & LLM Systems)

Tekhqs
California City, CA

About The Position

TEKHQS is seeking a Senior AI/ML Engineer to design, fine-tune, and deploy production-grade Generative AI and LLM-powered systems. This role is ideal for engineers who have shipped real-world ML systems, understand modern transformer architectures, and can operate across the full ML lifecycle—from data and training to inference and optimization. You will work on scalable AI platforms, enterprise-grade GenAI solutions, and intelligent systems integrated into Web, ERP, and enterprise workflows. This is a hands-on role with strong ownership and architectural influence.

Requirements

  • Strong experience with PyTorch and transformer architectures.
  • Hands-on experience with LLMs, embeddings, fine-tuning (LoRA/QLoRA), and prompt engineering.
  • Solid understanding of training vs inference tradeoffs, evaluation metrics, and model behavior.
  • Experience with RAG pipelines and vector databases (Pinecone, Weaviate, FAISS, Chroma).
  • Familiarity with RLHF concepts (DPO, PPO, reward modeling) is a plus.
  • Tokenization concepts (BPE, SentencePiece, tiktoken).
  • Quantization and optimization techniques (GPTQ, AWQ, int8, fp16).
  • Model serving using vLLM, Triton, HuggingFace TGI, or similar.
  • Experience deploying models on AWS, Azure, or GCP.
  • Distributed training or inference using DeepSpeed, FSDP, or Accelerate.
  • Data pipelines using Parquet, WebDataset, or cloud storage.
  • CI/CD for ML workflows.
  • Strong Python engineering practices.
  • Docker and Kubernetes for ML workloads.
  • Experience with monitoring, logging, and profiling ML systems.
  • Bachelor's or Master's degree in Computer Science, AI, Data Science, or a related field.
  • 4+ years of professional ML experience, with 3+ years in GenAI/LLMs.
  • Proven experience deploying AI systems to production.

Nice To Haves

  • Experience with ERP-integrated AI solutions (NetSuite, SAP, Dynamics).
  • Exposure to multi-agent systems, orchestration frameworks, or AutoGen/LangGraph.
  • Open-source contributions or published technical work.

Responsibilities

  • Design, fine-tune, and optimize transformer-based models (GPT, LLaMA, Mistral, T5) for production use cases.
  • Build and maintain end-to-end GenAI pipelines: data processing, training, evaluation, deployment, and monitoring.
  • Implement Retrieval-Augmented Generation (RAG) systems using vector databases and hybrid search.
  • Optimize inference for latency, throughput, and cost efficiency.
  • Work with multi-modal AI (text, embeddings, images, audio where applicable).
  • Integrate AI services into enterprise applications, ERP systems, and SaaS platforms.
  • Collaborate with product, backend, and cloud teams to deliver scalable AI solutions.
  • Apply best practices in ML governance, security, and responsible AI.