About The Position

When you join Ancestry, you join a human-centered company where every person’s story is important. Ancestry®, the global leader in family history, connects everyone with their past so they can discover, preserve, and share their unique family stories. With our unparalleled collection of more than 65 billion records, over 3.5 million subscribers, and over 27 million people in our growing DNA network, customers can discover their family story and gain a new level of understanding about their lives. Over the past 40 years, we’ve built trusted relationships with millions of people who have chosen us as the platform for discovering, preserving, and sharing the most important information about themselves and their families.

We are committed to our location-flexible work approach, allowing you to choose to work in the nearest office, from your home, or a hybrid of both (subject to location restrictions and roles that are required to be in the office; see the full list of eligible US locations HERE). We will continue to hire and promote beyond the boundaries of our office locations, to enable broadened possibilities for employee diversity. Together, we work every day to foster a work environment that's inclusive as well as diverse, and where our people can be themselves. Every idea and perspective is valued so that our products and services reflect the global and diverse clients we serve. Ancestry encourages applications from minorities, women, the disabled, protected veterans and all other qualified applicants. Passionate about dedicating your work to enriching people’s lives? Join the curious.

Ancestry is seeking an exceptional and highly motivated Agentic AI, Document Understanding Co-op to join our AI Applied Science Content team. You’ll play a vital role in the design and implementation of AI-native agentic systems that extract and organize text and image information from billions of historical and genealogical records, enabling customers to discover, share, and connect with their family history. The work will focus on building autonomous, multi-agent workflows capable of complex reasoning, tool use, analysis, and self-correction. You will also work closely with engineering teams to train, optimize, and deploy solutions that promote product development, customer success, and content creation across our Family History business. This is a part-time, work-study-based opportunity for students in active master's or PhD programs in 2026.

Requirements

  • Currently pursuing an advanced degree (Master's or PhD preferred) in Computer Science, Data Science, Statistics, Mathematics, Linguistics, Engineering or related quantitative field with a strong data focus.
  • Specialization in AI & LLMs, including familiarity with foundation models such as GPT, Gemini, Qwen, Llama, and Claude.
  • Experience with inference optimization and efficient fine-tuning (e.g., vLLM, LoRA, QLoRA, quantization).
  • Familiarity with embeddings, vector databases, and transformer models, along with software development experience.
  • Strong proficiency in Python and relevant tools and libraries for transformer models, multi-modal models, general NLP, and agentic workflows (e.g., Hugging Face Transformers, LangChain, LangGraph, CrewAI, AgentCore).

Nice To Haves

  • Familiarity with cloud platforms and related AI/ML services such as Google Cloud Platform (Gemini API, Vertex AI) and AWS (EC2, S3, SageMaker, Model Registry, Bedrock).

Responsibilities

  • Innovate with State-of-the-Art AI: Implement cutting-edge AI solutions for key Document Understanding tasks such as OCR/HTR, transcription, Named Entity Recognition (NER), Relation Extraction (RE), Coreference Resolution, Summarization, and Knowledge Graphs, working with diverse genealogical and historical collections spanning newspapers, city directories, family history books, and vital records (i.e., birth, marriage, and death records).
  • Analyze and Optimize Multi-Modal Models: Evaluate the performance of multi-modal models in zero-shot and few-shot learning scenarios for comprehensive document understanding.
  • Architect Agentic Systems: Design and implement multi-agent workflows using frameworks like LangChain, LangGraph, CrewAI, or AutoGen to automate complex multi-step reasoning tasks in historical document analysis.
  • Evaluation & Observability: Establish "LLM-as-a-Judge" frameworks and use tools like Arize Phoenix, DeepEval, or RAGAS to monitor for hallucination, drift, and bias.
  • Collaborate on Cloud Deployment: Partner closely with ML Ops and Data Science Engineers to seamlessly deploy datasets, models, and pipelines in cloud environments.
  • Communicate Insights Effectively: Clearly and confidently present your findings, deliverables, and proposed solutions to technical and non-technical audiences, including teams, stakeholders, and executives.