About The Position

The Amazon Artificial General Intelligence (AGI) Data Services organization is looking for a Language Engineer with experience in dataset construction, linguistic annotation, dialog/semantic schemas, and automatic processing of large datasets. You will play a critical role in driving innovation and advancing the state of the art in natural language processing and machine learning. You will work closely with cross-functional teams, including product managers, engineers, and data scientists, to ensure that our AI systems are aligned with human policies and preferences.

Requirements

  • Experience owning and executing language data collection projects, including development of guidelines, label sets, and annotation workflows
  • Master's or higher degree in a relevant field (Computational Linguistics or equivalent field with computational analysis)
  • 2+ years of experience in computational linguistics, language data processing, or AI data creation
  • Experience with language data annotation systems and other forms of data markup
  • Proficiency with scripting languages, such as Python
  • Experience working with speech, text, and multimodal data in multiple languages
  • Excellent communication and strong organizational skills; very detail-oriented
  • Comfortable working in a fast-paced, highly collaborative, dynamic work environment

Nice To Haves

  • PhD in Computational Linguistics (or equivalent field with computational emphasis)
  • Expertise in bootstrapping AI data collections for quickly evolving requirements
  • Extensive experience working with speech, text, and multimodal data in multiple languages
  • Experience in data creation for complex agentic workflows
  • Practical experience with machine learning and technical concepts such as APIs
  • Practical knowledge of version control and agile development; familiarity with database queries and data analysis processes (SQL, R, MATLAB, etc.)

Responsibilities

  • Design data collection/creation tasks in response to science needs: author instructions, define and implement quality targets and mechanisms, provide day-to-day coordination of data collection efforts (including planning, scheduling, and reporting), and be responsible for the final deliverables
  • Analyze and extract language-related insights from large amounts of data
  • Build tools or tool prototypes for data analysis or data authoring, using Python or another scripting language
  • Use modeling tools to bootstrap or test new functionalities
  • Collaborate with scientists and software engineers to evaluate performance of language models
  • Handle competing requests from a range of data customers

Benefits

  • Health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental Life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage)
  • 401(k) matching
  • Paid time off
  • Parental leave