Software AI Engineer - US

Jade Global
San Jose, CA
Hybrid

About The Position

Job Title: Software AI Engineer/Architect
Location: Santa Clara, CA (onsite preferred, but remote candidates can be considered)
Experience: 8-10 years
Job Type: Contract/FTE

This role requires a deep, end-to-end understanding of how Large Language Models are built, trained, optimized, deployed, and operated. Candidates must demonstrate hands-on experience beyond consuming hosted LLM APIs, with a strong grasp of the underlying ML theory, system trade-offs, and production realities of AI/ML solutions.

Mandatory Competency Areas (Non-Negotiable)

1. Foundations of LLMs (How They Actually Work)

Candidates must demonstrate first-principles understanding, including:

  • Transformer architectures (attention, embeddings, positional encoding)
  • Tokenization strategies and their impact on cost & performance
  • Training vs. inference behavior
  • Loss functions, pre-training objectives, and alignment techniques (SFT, RLHF)
  • Limitations: hallucinations, bias, context collapse, long-range degradation

2. Model Development & Adaptation

Hands-on experience with:

  • Pre-training vs. fine-tuning trade-offs
  • Parameter-efficient tuning (LoRA, QLoRA, adapters)
  • Quantization and pruning techniques
  • Model evaluation beyond accuracy (task fitness, safety, robustness)
  • Data curation, labeling strategies, and contamination risks

3. Inference, Serving & Optimization

Strong understanding of:

  • Inference pipelines and token generation mechanics
  • KV caching, batching, streaming responses
  • Throughput vs. latency trade-offs
  • Memory constraints and GPU utilization strategies
  • Model parallelism (tensor, pipeline) and its failure modes

4. End-to-End AI/ML System Design

Ability to architect complete AI solutions, including:

  • Data ingestion and preprocessing pipelines
  • Training/fine-tuning workflows
  • Model registry, versioning, and lineage
  • Deployment strategies (canary, A/B, shadow traffic)
  • Feedback loops for continuous improvement

5. Retrieval, Memory & Tool-Augmented Systems

In-depth experience with:

  • Retrieval-Augmented Generation (RAG) design
  • Embeddings lifecycle management
  • Vector databases and hybrid retrieval
  • Prompt/tool orchestration and agentic workflows
  • Failure modes of RAG and mitigation strategies

6. MLOps, Observability & Reliability

Strong ownership mindset for production AI:

  • Monitoring model quality drift and regressions
  • Debugging hallucinations and retrieval failures
  • Logging prompts, responses, and model metadata
  • Cost tracking and optimization (token economics)
  • Incident response for AI systems

7. Security, Ethics & Governance

Clear understanding of:

  • Prompt injection and data leakage risks
  • Training data privacy and IP protection
  • Model abuse, misuse, and guardrails
  • Regulatory and compliance considerations
  • Responsible AI principles in production systems

Working at Jade Global

Talented people are drawn to world-class organizations that offer outstanding opportunities, and Jade Global is an employer of choice for individuals around the world. We invest in each employee's personal and professional wellbeing because we understand that client success, as well as our ultimate success, starts with our employees. We seek to provide the benefits you need while standing behind you every step of the way. Our programs include health-related policies and a leave donation policy.

Requirements

  • 8-10 years of experience
  • Deep, end-to-end understanding of how Large Language Models are built, trained, optimized, deployed, and operated.
  • Demonstrate first-principles understanding, including: Transformer architectures (attention, embeddings, positional encoding)
  • Tokenization strategies and their impact on cost & performance
  • Training vs. inference behavior
  • Loss functions, pre-training objectives, and alignment techniques (SFT, RLHF)
  • Limitations: hallucinations, bias, context collapse, long-range degradation
  • Hands-on experience with: pre-training vs. fine-tuning trade-offs
  • Parameter-efficient tuning (LoRA, QLoRA, adapters)
  • Quantization and pruning techniques
  • Model evaluation beyond accuracy (task fitness, safety, robustness)
  • Data curation, labeling strategies, and contamination risks
  • Strong understanding of: Inference pipelines and token generation mechanics
  • KV caching, batching, streaming responses
  • Throughput vs. latency trade-offs
  • Memory constraints and GPU utilization strategies
  • Model parallelism (tensor, pipeline) and its failure modes
  • Ability to architect complete AI solutions, including: Data ingestion and preprocessing pipelines
  • Training / fine-tuning workflows
  • Model registry, versioning, and lineage
  • Deployment strategies (canary, A/B, shadow traffic)
  • Feedback loops for continuous improvement
  • In-depth experience with: Retrieval-Augmented Generation (RAG) design
  • Embeddings lifecycle management
  • Vector databases and hybrid retrieval
  • Prompt/tool orchestration and agentic workflows
  • Failure modes of RAG and mitigation strategies
  • Strong ownership mindset for production AI: Monitoring model quality drift and regressions
  • Debugging hallucinations and retrieval failures
  • Logging prompts, responses, and model metadata
  • Cost tracking and optimization (token economics)
  • Incident response for AI systems
  • Clear understanding of: Prompt injection and data leakage risks
  • Training data privacy and IP protection
  • Model abuse, misuse, and guardrails
  • Regulatory and compliance considerations
  • Responsible AI principles in production systems

Benefits

  • health-related policies
  • leave donation policy