Principal Research Engineer, Post-Training

Character.AI • Redwood City, CA

2d • $275,000 - $400,000

About The Position

As a Principal Research Engineer on the Post-Training team, you will drive the technical vision, execution, and evolution of the systems that transform foundation models into intelligent, engaging, and aligned products. Specifically, your team focuses on post-training of top-tier OSS LLMs (such as Mistral and Qwen) to power the highly immersive role-playing chat features of Character.AI. You will lead initiatives spanning data, algorithms, infrastructure, and evaluation, helping define how our models learn from feedback and improve over time. This is a highly cross-functional role that combines deep technical expertise with organizational leadership. You will partner closely with researchers, engineers, product teams, and infrastructure teams to identify the highest-leverage opportunities for improving model performance and user experience. Your work will directly shape the conversational experiences of millions of users every day. At Character.AI, you will have the opportunity to influence both the direction of our research and the systems that bring it into production, helping build the next generation of AI entertainment.

Requirements

PhD in Computer Science, Machine Learning, AI, or a related field, or equivalent industry experience.
Significant experience leading technical projects or teams in machine learning, AI research, or large-scale distributed systems.
Experience scaling and mentoring high-performing research and engineering teams.
Deep understanding of modern machine learning techniques, including transformers, reinforcement learning, alignment methods, and large language models.
Strong track record of delivering impactful research or applied ML systems in production environments.
Expertise in designing, building, and maintaining production-quality ML systems and infrastructure.
Experience training, serving, debugging, and optimizing large-scale models on GPU-based systems.
Experience leading teams working on large language model training, mid-training, or post-training.
Experience with product experimentation, online evaluation, and A/B testing frameworks.
Strong software engineering skills with the ability to write clean, maintainable, and scalable code.
Excellent communication skills and the ability to influence technical direction across teams.
Experience leading complex, cross-functional initiatives across data, training infrastructure, evaluation, and model serving.

Nice To Haves

Hands-on experience working directly with open-source models like Mistral and Qwen, particularly adapting them via mid- and post-training for specific personas, creative writing, or role-playing applications.
Familiarity with cloud-native ML infrastructure, including Kubernetes, Docker, and modern orchestration platforms.
Publications in leading machine learning conferences or demonstrated contributions to the broader AI community.

Responsibilities

Define and drive the technical roadmap for mid- and post-training systems, balancing research innovation with production reliability and scalability.
Mentor and grow a team of researchers and engineers through technical guidance, design reviews, and career development.
Establish best practices for experimentation, model development, and deployment.
Lead the development of alignment algorithms, optimization techniques, and training objectives to improve model capabilities and data efficiency.
Drive advances in mid- and post-training methodologies including reinforcement learning, preference optimization, supervised fine-tuning, and emerging alignment approaches.
Identify and execute high-impact research opportunities that improve model behavior, safety, and user engagement.
Develop robust evaluation frameworks and quality signals to measure real-world model performance.
Lead the design of efficient training and inference systems for large-scale generative models.
Architect scalable data pipelines that transform diverse data sources into high-quality training datasets.
Partner with infrastructure teams to optimize distributed training, GPU utilization, and serving efficiency.
Drive improvements in experimentation platforms, data quality systems, and model observability.

Benefits

We value diversity and welcome applicants from all backgrounds.
We are an equal opportunity employer.
We do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, or disability.
