Machine Learning Software Developer - Foundational & Agentic AI

Circle Cardiovascular Imaging

Onsite

About The Position

We are seeking a Machine Learning Software Developer to build and deploy production-grade AI systems for our flagship clinical software. This role focuses on foundation models and agentic AI workflows that support clinical reporting, findings summarization, and conversational assistance. You will develop and scale LLM-based systems using retrieval-augmented generation, tool integration, structured outputs, and orchestration frameworks across local and cloud environments. Success in this role requires strong attention to reliability, observability, safety, and backend integration within a regulated clinical setting.

Requirements

4+ years of experience building and deploying machine learning or AI systems in production.
Strong expertise in deep learning architectures, including Transformers and diffusion models, with proficiency in PyTorch.
Hands-on experience building agentic and LLM-based applications using retrieval-augmented generation, structured outputs, function calling, workflow orchestration, and evaluation frameworks.
Experience with distributed training and optimization in HPC or cloud environments using tools such as PyTorch Distributed, Ray, DeepSpeed, Megatron, or CUDA.
Strong Python and software engineering skills, including testing, debugging, version control, and experience building REST APIs, backend services, or microservices.

Nice To Haves

Hands-on experience training or fine-tuning large foundation models in distributed compute environments.
Familiarity with multi-agent systems, workflow engines, graph-based orchestration frameworks, and cloud platforms such as AWS, Azure, or GCP.
Proficiency in MLOps or LLMOps tooling such as Docker, Kubernetes, MLflow, Airflow, CI/CD pipelines, or model monitoring systems.
Background in healthcare, biomedical imaging, or other regulated software environments, including translating research into product features.

Responsibilities

Build and scale training pipelines in collaboration with Research Scientists, translating experimental ideas into production-grade ML systems.
Design and deploy agentic and LLM-powered workflows for clinical reporting, summarization, and conversational assistance using tool integration, function calling, structured outputs, and orchestration frameworks.
Develop retrieval-augmented generation pipelines and backend services that integrate AI capabilities into a secure, scalable C++-based platform.
Establish evaluation, observability, and monitoring practices to measure and improve quality, factuality, safety, latency, reliability, and runtime performance.
Support local and cloud deployment of models and inference services with a focus on privacy, resilience, maintainability, and strong engineering practices.