Data Labeling Associate

Welocalize, San Francisco, CA

About The Position

The ideal candidate will have a foundational understanding of machine learning, data annotation, quality assurance, and natural language processing. They will play a pivotal role in updating our machine learning models and ensuring their efficacy. This role focuses primarily on US English data sets; however, familiarity with translation or multilingual data sets is a plus for future projects.

Welocalize is a leading technology-enabled provider of translation, localization, and AI-driven content solutions, helping businesses communicate, innovate, and grow globally. Specializing in complex and regulated industries, Welocalize delivers precise, scalable multilingual content through a powerful combination of advanced AI technologies and expert human talent. At its core is Welocalize’s AI-enabled OPAL platform, which transforms translation workflows by integrating machine translation (MT) and large language models (LLMs) to provide fast, accurate, and culturally relevant content in over 300 languages. With a commitment to excellence, Welocalize holds seven ISO certifications. Welocalize is headquartered in New York, with offices around the globe.

Requirements

  • Foundational understanding of machine learning, data annotation, quality assurance, and natural language processing.
  • Ability to work in a fast-paced, collaborative environment.
  • Excellent communication skills.
  • Familiarity with command-line tools and interfaces.
  • Strong analytical skills with the ability to identify patterns and anomalies.

Nice To Haves

  • Familiarity with translation or multilingual data sets, which is a plus for future projects.

Responsibilities

  • Update training and test model databases with new or amended synthetic textual and image data.
  • Modify and refine machine learning data creation, annotation, and rating guidelines.
  • Initiate model training processes using internal tools and command-line interfaces.
  • Evaluate the performance of trained models to gauge their efficacy and readiness for deployment.
  • Design and develop test and training datasets as per the criteria provided by the project manager and other full-time employees.
  • Handle data efficiently, ensuring its integrity throughout the workflow.
  • Engage in data relevance tasks, ensuring data sets are aligned with project goals.
  • Annotate data accurately, ensuring it adheres to set guidelines.
  • Conduct manual quality analysis of model results.
  • Recognize error patterns and report anomalies for further investigation.
  • Deliver detailed reports on findings, covering utterance quality, LLM evaluation, ASR bug tracking, and customer pain points, for review by the User Experience Research team.
  • Implement basic quality control measures and ensure the reliability of processed data.
  • Utilize intermediate data analysis techniques to extract insights and inform decision-making.
  • Arbitrate discrepancies effectively, ensuring consistent data quality.
  • Apply basic knowledge of natural language processing and linguistics to data processing tasks.
  • Ensure linguistic accuracy in all processed and annotated data.