Italian Data Labeling Associate - California based

Welo Data•San Francisco, CA

21h•Onsite

About The Position

Welo Data is seeking Data Labeling Associates for Project Perseus. The role is for individuals with professional-level proficiency in Italian, a background in writing, and attention to AI safety. It is not a traditional annotation role: associates work at the intersection of human judgment and machine learning, evaluating how cutting-edge AI systems handle Italian language nuances, cultural context, and safety boundaries.

Associates critique Italian AI outputs for accuracy, fluency, and tone, and provide structured feedback to drive model improvement. They draw on 2+ years of GenAI safety experience to identify bias, risks, and failure modes in Italian datasets, apply complex linguistic guidelines thoughtfully, and help refine evaluation frameworks to reflect real-world Italian dialects and contexts. They also identify trends in model performance, communicate findings clearly to technical stakeholders, and participate in team syncs to ensure consistent quality across global AI workstreams.

This is a W2 full-time employee position, 40 hours per week, on-site in California (San Francisco, Sunnyvale, or Burlingame), with a pay rate of $34 per hour. It offers a chance to move beyond traditional data work and play a direct role in how AI systems are evaluated and improved, in a fast-moving, collaborative environment.

Requirements

Professional-level proficiency in Italian and a bachelor’s degree or higher.
English skills at B2 level or higher, sufficient to navigate technical guidelines and stakeholder syncs.
1–2 years of experience in professional writing, journalism, or content creation.
2+ years of experience focusing on Safety & Trust in Generative AI data delivery.
The ability to make sound judgment calls in ambiguous "grey area" scenarios.
A natural curiosity about LLMs and the confidence to flag issues when a model's behavior feels "off."

Responsibilities

Critique Italian AI outputs for accuracy, fluency, and tone, providing structured feedback to drive model improvement.
Use your 2+ years of GenAI safety experience to identify bias, risks, and failure modes in Italian datasets.
Apply complex linguistic guidelines thoughtfully and help refine evaluation frameworks to reflect real-world Italian dialects and contexts.
Identify trends in model performance and communicate findings clearly to technical stakeholders.
Participate in team syncs to ensure consistent quality across global AI workstreams.

Benefits

Free breakfast, lunch, and dinner with a wide variety of cuisines.
Micro-kitchens stocked with premium coffee, beverages, and healthy treats.
Modern collaborative spaces and unique amenities like rooftop nature parks (at select locations).
Comprehensive Medical, Dental, and Vision.
401(k) and HSA eligibility.
Free transport, shuttles, and bike-to-work perks.
