Senior Machine Learning Ops Engineer

Steampunk•McLean, VA

49d•$140,000 - $190,000

About The Position

We are seeking a Senior ML Ops Engineer who specializes in assessing and developing AI/ML infrastructure, with a focus on Generative AI (Gen AI) and Retrieval-Augmented Generation (RAG) pipelines. In this role, you will lead efforts to build, optimize, and scale advanced AI/ML infrastructure that supports Federal use cases. The ideal candidate has a deep understanding of AI/ML systems and is passionate about developing production-grade pipelines that deliver results. You will contribute to the growth of our AI & Data Exploitation Practice!

Requirements

Ability to hold a position of public trust with the US government.
Master's degree in a related program and 8 years of experience (7 of which must be relevant); OR Bachelor's degree in a related program and 10 years of relevant experience; OR no degree and 16 years of relevant experience.
Holds at least one professional certification relevant to the technical service provided, and maintains a certification relevant to the product being deployed and/or maintained.
7+ years of experience in AI/ML infrastructure assessment, development, and deployment, with a focus on production-grade pipeline development.
Proven experience in building and scaling AI/ML pipelines, including generative AI models like GANs, VAEs, and Transformer-based architectures.
Strong proficiency in Python and best practices for scalable and efficient coding; experience with R is a plus.
Experience with AI/ML frameworks such as TensorFlow, PyTorch, Keras, or JAX and a solid understanding of neural network architectures.
Expertise in cloud platforms (AWS, Azure, GCP) and experience with AI/ML tools like AWS SageMaker, Azure OpenAI, or similar.
Practical experience with MLOps tools and frameworks, including automation of deployment, monitoring, and model management.
Familiarity with DevSecOps practices to ensure secure and compliant deployment of AI/ML solutions.
Strong knowledge of data pipeline tools and data visualization platforms, such as Tableau, Power BI, or D3.
Experience with version control (Git), Bash, Unix commands, and cloud infrastructure automation tools.

Nice To Haves

Familiarity with search technologies such as Elasticsearch, AWS Kendra, or Azure Cognitive Search.

Responsibilities

Assess and design AI/ML infrastructure to support scalable and secure deployment of machine learning models, including generative AI pipelines.
Build and optimize Gen AI and RAG pipelines, ensuring seamless integration with existing systems and infrastructure.
Evaluate and improve the performance, reliability, and scalability of AI/ML pipelines, identifying and addressing bottlenecks.
Implement MLOps best practices for automating model deployment, monitoring, and retraining processes in production.
Collaborate with Data Scientists and Software Engineers to transition models from research into production-grade pipelines.
Continuously monitor AI/ML infrastructure and pipelines to ensure high performance, security, and compliance.
Use cloud-native services (AWS, Azure, GCP) to deploy scalable and cost-effective solutions, leveraging tools such as AWS SageMaker and Azure OpenAI.
Apply DevSecOps principles to maintain secure, reliable operations for AI/ML workflows, including CI/CD integration.
Stay up to date on the latest research, trends, and tools in AI/ML, incorporating cutting-edge technologies into infrastructure solutions.
Contribute to the growth and innovation of our AI & Data Exploitation Practice by delivering best-in-class AI/ML infrastructure solutions.