AI/ML Engineer for Assessment Systems

Northeastern University, Boston, MA

About The Position

You will develop, implement, and validate AI scoring systems that evaluate student transcripts using research-validated rubrics. This role combines natural language processing, prompt engineering, and educational assessment methodology to create scalable evaluation tools. You will work closely with the research team to ensure AI systems achieve acceptable agreement with human expert ratings.

Requirements

  • Graduate student or professional in Computer Science, Data Science, Computational Linguistics, or a related field
  • Strong Python programming skills (pandas, scikit-learn, numpy)
  • Experience with large language model APIs (OpenAI GPT-4, Claude, or similar)
  • Demonstrated ability to design and implement NLP pipelines
  • Understanding of evaluation metrics (precision, recall, F1, correlation, agreement statistics)
  • Experience with version control (Git) and collaborative coding practices

Nice To Haves

  • Experience with prompt engineering and LLM optimization
  • Background in natural language processing or computational linguistics
  • Familiarity with educational assessment or psychometrics
  • Knowledge of inter-annotator agreement statistics (Cohen's kappa, Krippendorff's alpha)
  • Experience with retrieval-augmented generation (RAG) systems
  • Understanding of bias detection and fairness metrics in AI systems

Responsibilities

  • Design AI scoring architecture combining rule-based and LLM components
  • Develop prompt engineering strategies for rubric-aligned evaluation
  • Implement quantitative indicator scoring (frequency counts, behavioral thresholds)
  • Create LLM-based qualitative judgment systems for complex indicators
  • Build data processing pipelines for transcript preparation and analysis
  • Compare AI ratings against human expert ground truth (n=200+ transcripts)
  • Calculate agreement statistics (correlation, Cohen's kappa, confusion matrices)
  • Identify systematic rating discrepancies and refine prompts/algorithms
  • Implement confidence scoring to flag uncertain AI judgments
  • Conduct bias audits across demographic subgroups (gender, race/ethnicity)
  • Deploy validated scoring system for 300+ student transcripts
  • Create technical documentation for scoring methodology
  • Develop user-facing explanations of AI ratings for educators

Benefits

  • Northeastern offers a comprehensive benefits package for benefits-eligible employees, including medical, vision, dental, paid time off, tuition assistance, wellness and life, and retirement benefits, as well as commuting and transportation.