Senior Machine Learning Engineer

webAI • Washington, DC

About The Position

We are seeking a Senior Machine Learning Engineer to support our Public Sector initiatives, which focus on building and optimizing production-ready AI systems for secure and distributed environments. You will be responsible for transforming prototype models into scalable, efficient, and reliable production systems that operate across a spectrum of hardware, from government cloud infrastructure to edge devices in restricted or disconnected environments.

Requirements

Active US security clearance.
4+ years of experience in applied AI, ML engineering, or production AI systems.
Deep proficiency in PyTorch, TensorFlow, or Hugging Face Transformers.
Proven experience deploying AI models across cloud, edge, and mobile hardware environments.
Expertise in model compression and optimization (quantization, pruning, distillation).
Experience building RAG pipelines and integrating vector databases (e.g., Qdrant, ChromaDB, FAISS, Milvus, Pinecone).
Familiarity with multi-modal models and synthetic data generation methods.
Strong algorithmic and problem-solving skills, especially in distributed or constrained compute environments.

Nice To Haves

Experience with edge AI, federated learning, or offline inference systems.
Understanding of AI governance and compliance frameworks relevant to public sector deployments.
Experience integrating models into large-scale distributed systems or microservice architectures.
Excellent communication and technical documentation skills for collaboration across multidisciplinary teams.
Strong understanding of GPU computing, CUDA, and performance profiling.

Responsibilities

Design, develop, and deploy agentic workflows to orchestrate multi-step reasoning, tool use, and decision-making across production systems.
Productionize AI models from research prototypes into scalable, deployable systems used in real-world applications.
Engineer adaptive ML systems using LoRA, PEFT, and on-device inference strategies, leveraging PyTorch, TensorFlow, and Hugging Face Transformers for model development, fine-tuning, and optimization.
Implement model optimization techniques such as quantization, pruning, distillation, and hardware-specific acceleration.
Build and maintain Retrieval-Augmented Generation (RAG) pipelines, including vector database integration for contextual retrieval.
Work with multi-modal AI systems across computer vision, audio, and natural language domains.
Optimize model execution for distributed and resource-constrained environments, ensuring reliability under variable connectivity conditions.