This role involves training large language models (LLMs) to generate production-grade code across multiple programming languages. The work includes comparing and ranking code snippets and explaining which is best and why; repairing and refactoring AI-generated code for correctness, efficiency, and style; and feeding ratings, edits, and test results into the Reinforcement Learning from Human Feedback (RLHF) pipeline to keep it running smoothly. The goal is to teach the model to propose, critique, and improve code the way an expert engineer would.
Job Type: Part-time
Career Level: Mid Level
Education Level: No Education Listed
Number of Employees: 11-50 employees