About The Position

TEKHQS is seeking a Senior AI/ML Engineer to design, fine-tune, and deploy production-grade Generative AI and LLM-powered systems. This role is ideal for engineers who have shipped real-world ML systems, understand modern transformer architectures, and can operate across the full ML lifecycle, from data and training to inference and optimization. You will work on scalable AI platforms, enterprise-grade GenAI solutions, and intelligent systems integrated into Web, ERP, and enterprise workflows. This is a hands-on role with strong ownership and architectural influence.

Requirements

Core AI / ML

  • Strong experience with PyTorch and transformer architectures.
  • Hands-on experience with LLMs, embeddings, fine-tuning (LoRA/QLoRA), and prompt engineering.
  • Solid understanding of training vs. inference tradeoffs, evaluation metrics, and model behavior.

GenAI & Systems

  • Experience with RAG pipelines and vector databases (Pinecone, Weaviate, FAISS, Chroma).
  • Familiarity with RLHF concepts (DPO, PPO, reward modeling) is a plus.
  • Tokenization concepts (BPE, SentencePiece, Tiktoken).

Model Optimization & Deployment

  • Quantization and optimization techniques (GPTQ, AWQ, int8, fp16).
  • Model serving using vLLM, Triton, Hugging Face TGI, or similar.
  • Experience deploying models on AWS, Azure, or GCP.

Data & Infrastructure

  • Distributed training or inference using DeepSpeed, FSDP, or Accelerate.
  • Data pipelines using Parquet, WebDataset, or cloud storage.
  • CI/CD for ML workflows.

Software Engineering

  • Strong Python engineering practices.
  • Docker and Kubernetes for ML workloads.
  • Experience with monitoring, logging, and profiling ML systems.
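Of the concepts listed above, byte-pair encoding (BPE) tokenization is the easiest to illustrate in a few lines. The sketch below is a toy, pure-Python version of the BPE merge-learning loop; the corpus words and frequencies are invented for illustration, and production tokenizers (SentencePiece, Tiktoken) are far more involved.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words:
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = []
    for symbols, freq in words:
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged.append((out, freq))
    return merged

def learn_bpe(corpus, num_merges):
    """Learn `num_merges` BPE merge rules from a {word: frequency} corpus."""
    words = [(list(w), f) for w, f in corpus.items()]
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges

merges = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 3)
# → [('e', 's'), ('es', 't'), ('l', 'o')]
```

Each learned merge rule becomes one vocabulary entry; applying the rules in order to new text reproduces the tokenizer's segmentation.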

Nice To Haves

  • Experience with ERP-integrated AI solutions (NetSuite, SAP, Dynamics).
  • Exposure to multi-agent systems, orchestration frameworks, or AutoGen/LangGraph.
  • Open-source contributions or published technical work.

Responsibilities

  • Design, fine-tune, and optimize transformer-based models (GPT, LLaMA, Mistral, T5) for production use cases.
  • Build and maintain end-to-end GenAI pipelines: data processing, training, evaluation, deployment, and monitoring.
  • Implement Retrieval-Augmented Generation (RAG) systems using vector databases and hybrid search.
  • Optimize inference for latency, throughput, and cost efficiency.
  • Work with multi-modal AI (text, embeddings, images, and audio where applicable).
  • Integrate AI services into enterprise applications, ERP systems, and SaaS platforms.
  • Collaborate with product, backend, and cloud teams to deliver scalable AI solutions.
  • Apply best practices in ML governance, security, and responsible AI.
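As a rough illustration of the retrieval step in a RAG system, the sketch below ranks pre-computed document embeddings by cosine similarity against a query vector. The document ids and vectors are invented for the example; a real pipeline would produce embeddings with a model and delegate the nearest-neighbor search to a vector database such as FAISS or Pinecone.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, index, top_k=2):
    """Return the top_k document ids ranked by similarity to the query."""
    ranked = sorted(index.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

# Toy "vector database": document id -> embedding.
index = {
    "invoice-faq": [0.9, 0.1, 0.0],
    "hr-handbook": [0.1, 0.8, 0.3],
    "erp-runbook": [0.7, 0.2, 0.6],
}
hits = retrieve([1.0, 0.0, 0.5], index, top_k=2)
# → ['erp-runbook', 'invoice-faq']
```

The retrieved documents would then be concatenated into the LLM prompt as grounding context; hybrid search adds a lexical (e.g., BM25) score alongside the vector score before ranking.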