Senior Software Engineer, Computer Vision

Clarium

Remote

About The Position

Clarium builds computer vision pipelines that extract structured data from clinical images under real-world conditions: variable lighting, uncontrolled image quality, and zero tolerance for silent errors. This role owns those pipelines end-to-end: improving accuracy, hardening reliability, and extending them to new use cases. The work sits at the intersection of AI API orchestration, image processing, and production Python backend engineering. You'll be building systems that combine frontier multimodal AI APIs with deterministic decoders to produce auditable, accurate results that clinical workflows depend on. This is not a research role: the systems you build have direct patient safety implications, and getting it right matters.

One important note on scope: this role does not involve training or fine-tuning models, MLOps infrastructure, or classical ML experimentation. If your background is in building production systems that orchestrate AI APIs and extract structured data reliably, rather than in training the models themselves, this is a strong fit.

Requirements

Production experience building systems on top of multimodal LLM APIs — effective structured-output prompts, schema validation, retry handling, and fallback design
Comfort with image preprocessing techniques: contrast normalization, thresholding, rotation, compression
Experience with machine-readable code decoding (1D/2D barcodes, QR codes, or similar) and the preprocessing strategies that improve success rates
Strong async Python: FastAPI, Pydantic v2, asyncpg, PostgreSQL
Reliability-first mindset — you build pipelines that produce auditable output even when individual stages fail

Nice To Haves

Experience with open-vocabulary or zero-shot object detection as a pipeline component
OCR or document understanding pipelines applied to structured data extraction
Durable workflow orchestration experience (Temporal, Prefect, Airflow, or similar)

Responsibilities

Build and improve multi-stage CV pipelines spanning object detection, multimodal LLM extraction, machine-readable code decoding, and multi-source reconciliation
Own pipeline accuracy — instrument field-level metrics, diagnose failure modes, and drive improvements through prompt engineering, preprocessing strategy, and reconciliation logic
Write and maintain structured prompting protocols for multimodal models, including systematic extraction sequences, confidence calibration, and graceful handling of ambiguous inputs
Design persistence schemas and audit data models that make every extraction independently reviewable
Maintain and extend the async Python backend services that surface pipeline results to downstream clinical workflows

Benefits

Incentive Stock Options proportionate to your salary
Fully remote — we're a distributed team across multiple time zones
Unlimited PTO
Top-tier health, vision, and dental benefits
The opportunity to build on a strong foundational team with deep data and engineering roots at a stage where your work genuinely shapes the product
