Senior Research Software Engineer

Harvard University
Cambridge, MA
Hybrid

About The Position

The Harvard Data Science Initiative (HDSI) is hiring a Senior Research Software Engineer (RSE) to support a portfolio of faculty-led research projects under the HDSI–AWS Impact Computing Alliance. This role is designed for an engineer who thrives in research settings and enjoys translating scientific goals into robust, efficient, and reproducible AI/ML systems.

Rather than being tied to a single lab, the RSE will provide shared, cross-project engineering support, helping multiple teams accelerate discovery by building and optimizing machine learning infrastructure, improving performance on modern hardware (including AI accelerators), and enabling scalable execution in AWS and HPC environments. Projects may span domains such as climate and environmental science, global health, and other areas aligned with the alliance’s mission to deliver measurable social and environmental impact.

This is a hands-on role with strong collaboration expectations: you’ll work directly with researchers, HDSI technical leadership, and the alliance team to deliver production-grade research software and reusable technical patterns that benefit multiple projects across the Impact Computing umbrella. This position is a benefits-eligible, two-year term appointment through June 30, 2028.

Requirements

  • Minimum of seven years’ post-secondary education or relevant work experience
  • BS or MS (or equivalent practical experience) in Computer Science, Computer Engineering, Data Science, or a closely related field
  • Strong programming skills in Python/C/C++
  • Experience working with ML frameworks such as PyTorch, TensorFlow, JAX, XLA, Triton, ONNX, Caffe2, or TensorRT
  • Proven experience in deep learning at scale and familiarity with the “alphabet soup” of distributed computing (DP, TP, SP, CP, EP)
  • Experience with production environments, including Git-based workflows
  • Experience working in AWS cloud or HPC environments used for large-scale computation
  • Prior experience in a research or research-adjacent environment, with an understanding of the scientific software lifecycle
  • Strong communication skills and a collaborative working style

Nice To Haves

  • Contributions to compiler infrastructures and optimization frameworks (e.g., MLIR, LLVM, XLA, TVM, IREE, Halide)
  • Experience developing or optimizing high-performance libraries or kernels (e.g., cuBLAS, cuDNN, CUTLASS, HIP, ROCm, or similar)
  • Experience with distributed AI/ML training and performance optimization (e.g., PyTorch DDP, FSDP, DeepSpeed)
  • Experience building tooling for runtime analysis, profiling, and performance diagnostics
  • Experience with secure or privacy-constrained data environments (e.g., HIPAA-aware engineering practices)
  • Experience working in interdisciplinary research areas such as climate, environment, health, or astrophysics
  • Completion of the foundational courses specified by Harvard IT Academy (or an external equivalent)

Responsibilities

  • Design, build, and maintain ML/AI systems and research software in Python and C/C++
  • Develop and optimize machine learning training and inference pipelines for accelerator-based systems
  • Apply systems- and compiler-level optimizations, including loop transformations, vectorization, parallelization, and hardware-specific tuning (e.g., SIMD)
  • Implement and optimize kernels using CUDA, OpenMP, OpenCL, or accelerator-specific programming models
  • Contribute to or integrate with compiler and IR frameworks such as MLIR, LLVM, XLA, IREE, TVM, or Halide
  • Analyze and improve performance using profiling and diagnostics focused on latency, memory bandwidth, I/O throughput, and compute utilization
  • Support execution in AWS cloud and HPC environments, including large-scale model training, profiling, debugging, scaling, cost/performance tuning, reliability, CI/testing, packaging, deployment, and reproducibility engineering
  • Follow and promote modern ML and scientific software best practices, including experiment tracking, reproducibility, version control, testing, packaging, and documentation
  • Collaborate closely with faculty, researchers, and AWS consulting partners on systems engineering, performance optimization, ML infrastructure, compiler/framework integration, and cloud/HPC execution
  • Communicate technical findings, tradeoffs, and progress clearly to research stakeholders (including documentation and handoff-ready tooling)

Benefits

  • Generous paid time off including parental leave
  • Medical, dental, and vision health insurance coverage starting on day one
  • Retirement plans with university contributions
  • Wellbeing and mental health resources
  • Support for families and caregivers
  • Professional development opportunities including tuition assistance and reimbursement
  • Commuter benefits, discounts, and campus perks