Senior AI/ML Engineer (GenAI & LLM Systems)

Tekhqs • California City, CA

About The Position

TEKHQS is seeking a Senior AI/ML Engineer to design, fine-tune, and deploy production-grade Generative AI and LLM-powered systems. This role is ideal for engineers who have shipped real-world ML systems, understand modern transformer architectures, and can operate across the full ML lifecycle, from data and training to inference and optimization. You will work on scalable AI platforms, enterprise-grade GenAI solutions, and intelligent systems integrated into web, ERP, and enterprise workflows. This is a hands-on role with strong ownership and architectural influence.

Requirements

Strong experience with PyTorch and transformer architectures.
Hands-on experience with LLMs, embeddings, fine-tuning (LoRA/QLoRA), and prompt engineering.
Solid understanding of training vs inference tradeoffs, evaluation metrics, and model behavior.
Experience with RAG pipelines and vector databases (Pinecone, Weaviate, FAISS, Chroma).
Familiarity with RLHF and preference-optimization concepts (PPO, DPO, reward modeling) is a plus.
Understanding of tokenization concepts and tools (BPE, SentencePiece, tiktoken).
Quantization and optimization techniques (GPTQ, AWQ, int8, fp16).
Model serving using vLLM, Triton, HuggingFace TGI, or similar.
Experience deploying models on AWS, Azure, or GCP.
Distributed training or inference using DeepSpeed, FSDP, or Accelerate.
Data pipelines using Parquet, WebDataset, or cloud storage.
CI/CD for ML workflows.
Strong Python engineering practices.
Docker and Kubernetes for ML workloads.
Experience with monitoring, logging, and profiling ML systems.
Bachelor's or Master's degree in Computer Science, AI, Data Science, or a related field.
4+ years of professional ML experience, with 3+ years in GenAI/LLMs.
Proven experience deploying AI systems to production.

Nice To Haves

Experience with ERP-integrated AI solutions (NetSuite, SAP, Dynamics).
Exposure to multi-agent systems or orchestration frameworks (e.g., AutoGen, LangGraph).
Open-source contributions or published technical work.

Responsibilities

Design, fine-tune, and optimize transformer-based models (GPT, LLaMA, Mistral, T5) for production use cases.
Build and maintain end-to-end GenAI pipelines: data processing, training, evaluation, deployment, and monitoring.
Implement Retrieval-Augmented Generation (RAG) systems using vector databases and hybrid search.
Optimize inference for latency, throughput, and cost efficiency.
Work with multi-modal AI (text, embeddings, images, audio where applicable).
Integrate AI services into enterprise applications, ERP systems, and SaaS platforms.
Collaborate with product, backend, and cloud teams to deliver scalable AI solutions.
Apply best practices in ML governance, security, and responsible AI.
