Data Science - AI Document Understanding, Co-op

Ancestry

Hybrid

About The Position

Ancestry is seeking an exceptional and highly motivated AI Engineer / Data Science Co-op to join our AI Applied Science Content team. You’ll play a vital role in the design and implementation of AI-native agentic systems that extract and organize text and image information from billions of historical and genealogical records, enabling customers to discover, share, and connect with their family history. The work will focus on building autonomous, multi-agent workflows capable of complex reasoning, tool use, analysis, and self-correction. You will also work closely with engineering teams to train, optimize, and deploy solutions that promote product development, customer success, and content creation across our Family History business. This is a part-time, work-study-based opportunity designed for active master's and PhD students continuing their education in the fall.

Requirements

Currently pursuing an advanced degree (Master's or PhD preferred) in Computer Science, Data Science, Statistics, Mathematics, Linguistics, Engineering or related quantitative field with a strong data focus.
Specialization in AI & LLMs including familiarity with foundational models such as GPT, Gemini, Qwen, Llama, Claude, etc.
Experience with inference optimization, vLLM, LoRA, QLoRA, quantization, etc.
Familiarity with embeddings, vector databases, and transformer models, along with software development experience.
Strong proficiency in Python and relevant tools and libraries for transformer models, multi-modal models, general NLP, and agentic workflows (e.g., Hugging Face Transformers, LangChain, LangGraph, CrewAI, AgentCore).

Nice To Haves

Familiarity with cloud platforms and related AI/ML services such as Google Cloud Platform (GCP), Gemini API, Vertex AI, AWS EC2, S3, SageMaker, Model Registry, and Bedrock.

Responsibilities

Implement cutting-edge AI solutions for key Document Understanding tasks such as OCR/HTR, transcription, Named Entity Recognition (NER), Relation Extraction (RE), Coreference Resolution, Summarization, and Knowledge Graphs, working with diverse genealogical and historical collections spanning newspapers, city directories, family history books, and vital records (i.e., birth, marriage, and death records).
Evaluate the performance of multi-modal models in zero-shot and few-shot learning scenarios for comprehensive document understanding.
Design and implement multi-agent workflows using frameworks like LangChain, LangGraph, CrewAI, or AutoGen to automate complex multi-step reasoning tasks in historical document analysis.
Establish "LLM-as-a-Judge" frameworks and use tools like Arize Phoenix, DeepEval, or RAGAS to monitor for hallucination, drift, and bias.
Partner closely with ML Ops and Data Science Engineers to seamlessly deploy datasets, models, and pipelines in cloud environments.
Clearly and confidently present your findings, deliverables, and proposed solutions to technical and non-technical audiences, including teams, stakeholders, and executives.

What This Job Offers

Job Type: Part-time
Career Level: Entry Level
