DevOps/MLOps Engineer

NiyamIT, Ashburn, VA
Hybrid

About The Position

Niyam is seeking a DevOps/MLOps Engineer to join our team in support of our work with a federal client. This role is responsible for designing, automating, and maintaining scalable infrastructure and deployment pipelines that support both traditional software development and AI/ML model lifecycles. The ideal candidate brings a strong foundation in DevOps practices, cloud-native technologies, and MLOps frameworks, with a focus on automation, reliability, and secure operations within regulated environments. This position requires close collaboration with software engineers, data scientists, and cybersecurity teams to deliver resilient and compliant solutions. We offer competitive compensation and benefits. This full-time position is hybrid, based in Ashburn, VA. This position is contingent upon award of contract.

Requirements

  • US Citizenship with ability to obtain a Public Trust.
  • Bachelor’s degree or higher in Computer Science, Engineering, Information Technology, or a related technical discipline from an accredited institution.
  • Minimum of 4 years of experience in DevOps, Site Reliability Engineering (SRE), MLOps, or a related field supporting enterprise or mission-critical systems.
  • Hands-on experience designing and maintaining CI/CD pipelines using tools such as Jenkins, GitLab CI/CD, GitHub Actions, or similar.
  • Experience with Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, or Ansible.
  • Experience working with cloud platforms such as AWS, Microsoft Azure, or Google Cloud Platform (GCP).
  • Proficiency in containerization technologies such as Docker and orchestration platforms such as Kubernetes.
  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack, or similar).
  • Familiarity with supporting AI/ML workflows and implementing MLOps practices is highly preferred.
  • Understanding of security best practices and experience working within regulated or federal environments (e.g., NIST, FedRAMP) is preferred.
  • Strong problem-solving, troubleshooting, and communication skills, with the ability to collaborate across cross-functional teams.
  • Local to Ashburn, VA and available to work onsite as needed.

Nice To Haves

  • Experience supporting federal agencies or working in government contracting environments.
  • Familiarity with model versioning and experiment tracking tools (e.g., MLflow, SageMaker, or similar).
  • Experience with scripting or programming languages such as Python, Bash, or Go.
  • Knowledge of microservices architecture and API-driven development.
  • Experience implementing automated security scanning and compliance checks within CI/CD pipelines.

Responsibilities

  • Design, implement, and maintain robust CI/CD pipelines to support continuous integration and delivery of both application code and AI/ML models across development, testing, and production environments.
  • Automate infrastructure provisioning, configuration management, and deployment processes using Infrastructure as Code (IaC) tools to ensure consistency, scalability, and repeatability.
  • Manage and optimize cloud-based environments, leveraging platforms such as AWS, Azure, or GCP to support high availability, fault tolerance, and cost efficiency.
  • Implement and manage containerization and orchestration technologies (e.g., Docker, Kubernetes) to support scalable, portable, and resilient application and model deployments.
  • Monitor system performance, availability, and reliability using centralized logging, metrics, and alerting tools; proactively identify and resolve performance bottlenecks and system issues.
  • Ensure seamless integration and promotion of code and models across development, testing, staging, and production environments through automated workflows and release management processes.
  • Collaborate with data scientists and ML engineers to operationalize machine learning models, enabling versioning, reproducibility, and continuous model delivery through MLOps best practices.
  • Implement and enforce security best practices across the DevOps lifecycle, including secure configurations, vulnerability management, and compliance with federal security standards.
  • Support site reliability engineering (SRE) practices, including incident response, root cause analysis, and continuous improvement of system resilience.
  • Document infrastructure, pipelines, and operational procedures to support maintainability, auditability, and compliance with federal standards and accreditation requirements.

Benefits

  • Flexible Work Hours: Life doesn’t always fit into a 9-to-5 schedule. We offer flexibility to help you manage your work-life balance effectively.
  • Remote Work: Niyam understands the value of flexibility. We offer remote work.
  • Career Growth: Niyam is not just a job; it’s a career journey. We provide a supportive environment for your professional development and offer fully paid opportunities for training and advancement within the company.
  • Great People: Our people are the blueprint of who Niyam is to the industry and community.
  • Great Environment: Niyam fosters a great environment where innovation, collaboration, and personal growth thrive.
  • Diversity & Inclusion: We believe in the strength of diverse perspectives. Your unique ideas are welcomed and celebrated every day at Niyam.