Data Annotation Associate

Protege
Remote

About The Position

We are building Protege to solve the biggest unmet need in AI: getting access to the right training data. The process today is time-intensive, incredibly expensive, and often ends in failure. The Protege platform facilitates the secure, efficient, and privacy-centric exchange of AI training data. Solving AI’s data problem is a generational opportunity. We’re backed by world-class investors and already powering partnerships with some of the most ambitious teams in AI. The company that succeeds will be one of the largest in AI, and in tech.

We’re a lean, fast-moving, high-trust team of builders who are obsessed with velocity and impact. Our culture is built for people who thrive on ambiguity, own outcomes, and want to shape the future of data and AI.

The Data Annotation & Redaction Associate (PHI) supports Protege’s core data operations by helping prepare sensitive healthcare documents for AI training workflows. This is a fully remote, W2 role based anywhere in the United States, and will involve handling Protected Health Information (PHI) under strict security and confidentiality requirements. This is a 2-month position with the possibility of extending into a permanent hire based on business needs and performance.

Your first major project will be de-identifying thousands of PDFs by redacting HIPAA identifiers (e.g., names, locations, ages, dates, contact information, record numbers) according to a clear playbook and review process. Accuracy, consistency, and speed matter. Full-time (40 hrs/week) is preferred, but hours are flexible and we will consider part-time applicants.

Requirements

  • Authorized to work in the U.S. and able to work as a W2 employee based anywhere in the United States (required for PHI access)
  • Comfort handling sensitive information and following strict privacy/security rules
  • Experience with detail-oriented work (administrative operations, document review, medical records handling, QA, compliance support, or data labeling/annotation)
  • Comfort working with PDFs and basic productivity tools (Google Workspace / Microsoft Office)
  • Strong written communication and reliable follow-through
  • Ability to maintain speed and accuracy across large volumes of similar documents
  • You have excellent attention to detail and can do focused, repetitive work without accuracy drift
  • You’re dependable and consistent: you show up, hit your daily targets, and follow process
  • You learn rules quickly and apply them consistently
  • You are comfortable asking questions when something is unclear rather than guessing
  • You work well independently in a fully remote environment
  • You treat those around you with kindness

Nice To Haves

  • Prior experience in HIPAA-regulated environments or working with healthcare documents (hhs.gov)
  • Experience redacting or reviewing documents (legal, healthcare, insurance, or compliance contexts)
  • Experience in data annotation/labeling workflows
  • Comfort tracking work in spreadsheets and following simple metrics (throughput, error rate)

Responsibilities

  • De-identify high volumes of healthcare PDFs by accurately redacting PHI identifiers (names, locations, dates, ages, IDs, and other identifiers) in accordance with established guidelines (hhs.gov)
  • Follow a redaction/annotation playbook closely, including how to handle edge cases and when to escalate questions
  • Complete light QA on your own work (spot checks, verify redactions applied correctly, ensure no PHI remains visible/searchable)
  • Track daily throughput and communicate status clearly (what’s done, what’s blocked, what needs review)
  • Maintain organized file handling and versioning so work is easy to audit and review
  • Operate within strict security policies for PHI handling (confidentiality, access controls, and device hygiene)