Member of Technical Staff, ML Evaluation

Cognita Imaging • Palo Alto, CA

Onsite

About The Position

Cognita’s mission is to increase the world’s access to healthcare. Radiology is (1) the first-line diagnostic specialty, (2) facing a worsening global workforce shortage, and (3) highly digitized, making it uniquely positioned for AI to have an enormous impact. Stage one of Cognita is focused on expanding access to radiology at scale.

Our founding team met at Stanford, where they laid the groundwork for applying comprehensive AI to radiology. Building on that foundation, Cognita develops vision-language models that read radiology studies the way radiologists do, interpreting the full study in clinical context, and generate draft results that make radiologists more efficient and accurate. In partnership with Radiology Partners, Cognita’s models are trained and validated on one of the world’s largest real-world radiology datasets.

As a Member of Technical Staff focused on ML Evaluation, you will be responsible for understanding how well Cognita’s radiology models perform in real-world settings. Your work will focus on evaluating complex multimodal models, identifying failure modes, and designing investigations that reveal where and why models succeed or fall short.

Requirements

Experience evaluating complex machine learning models, including identifying strengths, weaknesses, and failure modes.
Strong analytical judgment and the ability to design focused investigations into model behavior.
Comfort working in ambiguous environments where the right evaluation approach is not obvious upfront.
Ability to collaborate closely with ML engineers and researchers to translate evaluation findings into action.
Clear written and verbal communication skills.

Nice To Haves

Biomedical, medical imaging, or healthcare-related experience.
Familiarity with radiology or healthcare workflows.
Experience evaluating large language models or multimodal systems.

Responsibilities

Evaluate large, complex ML models across a wide range of real-world cases and clinical scenarios.
Design investigations to understand model behavior beyond standard metrics.
Analyze model outputs to identify systematic errors and blind spots.
Develop evaluation frameworks and methodologies for multimodal and generative models.
Partner with ML training and infrastructure teams to guide model improvement.
Work with clinicians to incorporate domain context where needed and validate findings.
Communicate clearly about model limitations, risks, and readiness for deployment.
