Mindverse Consulting Services - posted about 16 hours ago
Part-time • Mid Level
Remote • San Jose, CA
1-10 employees

The customer is one of the world’s fastest-growing AI companies, accelerating the advancement and deployment of powerful AI systems. It helps customers in two ways: working with the world’s leading AI labs to advance frontier model capabilities in thinking, reasoning, coding, agentic behavior, multimodality, multilinguality, STEM, and frontier knowledge; and leveraging that work to build real-world AI systems that solve mission-critical priorities for companies. As a Software Engineering evaluator, you will create cutting-edge datasets for training, benchmarking, and advancing large language models, collaborating closely with researchers. This includes curating code examples, providing precise solutions, and making corrections in Python, JavaScript (including ReactJS), C/C++, Java, Rust, and Go; evaluating and refining AI-generated code for efficiency, scalability, and reliability; and working with cross-functional teams to enhance enterprise-level AI-driven coding solutions.

  • Work on AI model training initiatives by curating code examples, building solutions, and correcting code in Python, JavaScript (including ReactJS), C/C++, Java, Rust, and Go.
  • Evaluate and refine AI-generated code to ensure that it is efficient, scalable, and reliable.
  • Collaborate with cross-functional teams to improve AI-driven coding solutions as measured against industry performance benchmarks.
  • Build agents that can verify the quality of the code and identify error patterns.
  • Form hypotheses about steps in the software engineering cycle (prototyping, architecture design, API design, production implementation, launch, experiments, monitoring, operational maintenance) and evaluate model capabilities on them.
  • Design verification mechanisms that can automatically verify a solution to a software engineering task.
  • Several years of software engineering experience (5+ years), including 2+ years of continuous full-time experience at a top-tier product company (e.g., Google, Stripe, Amazon, Apple, Meta, Netflix, Microsoft, Datadog, Dropbox, Shopify, PayPal, IBM Research).
  • Strong expertise in building full-stack applications and deploying scalable, production-grade software using modern languages and tools.
  • Deep understanding of software architecture, design, development, debugging, and code quality/review assessment.
  • Excellent oral and written communication skills for clear, structured evaluation rationales.