About The Position

TEKHQS is seeking a Senior AI/ML Engineer to design, fine-tune, and deploy production-grade Generative AI and LLM-powered systems. This role is ideal for engineers who have shipped real-world ML systems, understand modern transformer architectures, and can operate across the full ML lifecycle, from data and training to inference and optimization. You will work on scalable AI platforms, enterprise-grade GenAI solutions, and intelligent systems integrated into Web, ERP, and enterprise workflows. This is a hands-on role with strong ownership and architectural influence.

Requirements

Core AI / ML

  • Strong experience with PyTorch and transformer architectures.
  • Hands-on experience with LLMs, embeddings, fine-tuning (LoRA/QLoRA), and prompt engineering.
  • Solid understanding of training vs. inference tradeoffs, evaluation metrics, and model behavior.

GenAI & Systems

  • Experience with RAG pipelines and vector databases (Pinecone, Weaviate, FAISS, Chroma).
  • Familiarity with RLHF concepts (DPO, PPO, reward modeling) is a plus.
  • Tokenization concepts (BPE, SentencePiece, Tiktoken).

Model Optimization & Deployment

  • Quantization and optimization techniques (GPTQ, AWQ, int8, fp16).
  • Model serving using vLLM, Triton, Hugging Face TGI, or similar.
  • Experience deploying models on AWS, Azure, or GCP.

Data & Infrastructure

  • Distributed training or inference using DeepSpeed, FSDP, or Accelerate.
  • Data pipelines using Parquet, WebDataset, or cloud storage.
  • CI/CD for ML workflows.

Software Engineering

  • Strong Python engineering practices.
  • Docker and Kubernetes for ML workloads.
  • Experience with monitoring, logging, and profiling ML systems.
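Of the concepts listed above, byte-pair encoding (BPE) tokenization is the easiest to illustrate in a few lines. The sketch below is a toy, pure-Python version of the BPE merge-learning loop; the corpus words and frequencies are invented for illustration, and production tokenizers (SentencePiece, Tiktoken) are far more involved.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words:
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = []
    for symbols, freq in words:
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged.append((out, freq))
    return merged

def learn_bpe(corpus, num_merges):
    """Learn `num_merges` BPE merge rules from a {word: frequency} corpus."""
    words = [(list(w), f) for w, f in corpus.items()]
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges

merges = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 3)
# → [('e', 's'), ('es', 't'), ('l', 'o')]
```

Each learned merge rule becomes one vocabulary entry; applying the rules in order to new text reproduces the tokenizer's segmentation.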

Nice To Haves

  • Experience with ERP-integrated AI solutions (NetSuite, SAP, Dynamics).
  • Exposure to multi-agent systems, orchestration frameworks, or AutoGen/LangGraph.
  • Open-source contributions or published technical work.

Responsibilities

  • Design, fine-tune, and optimize transformer-based models (GPT, LLaMA, Mistral, T5) for production use cases.
  • Build and maintain end-to-end GenAI pipelines: data processing, training, evaluation, deployment, and monitoring.
  • Implement Retrieval-Augmented Generation (RAG) systems using vector databases and hybrid search.
  • Optimize inference for latency, throughput, and cost efficiency.
  • Work with multi-modal AI (text, embeddings, images, and audio where applicable).
  • Integrate AI services into enterprise applications, ERP systems, and SaaS platforms.
  • Collaborate with product, backend, and cloud teams to deliver scalable AI solutions.
  • Apply best practices in ML governance, security, and responsible AI.
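As a rough illustration of the retrieval step in a RAG system, the sketch below ranks pre-computed document embeddings by cosine similarity against a query vector. The document ids and vectors are invented for the example; a real pipeline would produce embeddings with a model and delegate the nearest-neighbor search to a vector database such as FAISS or Pinecone.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, index, top_k=2):
    """Return the top_k document ids ranked by similarity to the query."""
    ranked = sorted(index.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

# Toy "vector database": document id -> embedding.
index = {
    "invoice-faq": [0.9, 0.1, 0.0],
    "hr-handbook": [0.1, 0.8, 0.3],
    "erp-runbook": [0.7, 0.2, 0.6],
}
hits = retrieve([1.0, 0.0, 0.5], index, top_k=2)
# → ['erp-runbook', 'invoice-faq']
```

The retrieved documents would then be concatenated into the LLM prompt as grounding context; hybrid search adds a lexical (e.g., BM25) score alongside the vector score before ranking.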