Research Intern, Special Projects

Higharc

1d•Remote

About The Position

Higharc is a VC-backed startup that is changing how new homes are designed and built. Join a founding team who’ve shipped products for Autodesk, Electronic Arts, Nike, and Apple. We have raised a total of $83M with support from top-notch venture capital firms and more than 18 strategic investors (industry leaders in construction, building products manufacturing, and distribution).

Higharc is seeking three PhD Research Interns to join our Special Projects team for a 12-week engagement (flexible start and end dates, with the possibility of extension based on performance and progress). This is a fully remote role open to applicants based in the United States.

You'll work directly with Higharc's Special Projects team across the full research lifecycle, from problem formulation and dataset strategy through training, evaluation, and publication-quality write-ups. The primary research areas are Vision-Language Models for architectural drawings (visually grounded retrieval, grounded document QA, and visual-to-structured-output) and semi-supervised learning for instance segmentation. The key deliverable is a publishable research contribution alongside reproducible code, ready for integration with Higharc ML products.

Requirements

Active enrollment in a PhD program in Computer Science, Machine Learning, or a related field at a U.S. institution.
Strong Python programming skills and experience building deep learning training workflows (PyTorch preferred).
Solid understanding of computer vision, transformers, representation learning, and ML experimentation practices.
Demonstrated research ability through publications, strong preprints, open-source research code, or equivalent evidence of research impact.
Comfort working in common research and engineering stacks (Roboflow, Modal, Weights & Biases, or similar).
You have strong research instincts, the ability to work independently on open-ended problems, and a track record of producing rigorous, reproducible work. You're comfortable moving quickly without sacrificing research hygiene.

Nice To Haves

Experience with vision-language models or multimodal foundation models.
Experience designing and deploying semi-supervised learning methods (pseudo-labeling, self-training, distillation, consistency regularization) and familiarity with common failure modes such as confirmation bias, noisy pseudo-labels, and calibration drift.
Experience with multi-GPU training and large-scale experimentation.
Familiarity with AEC (architecture, engineering, and construction) data and workflows: CAD/BIM concepts, plan understanding, or domain-specific labeling.

Responsibilities

Build data pipelines and extract data from Higharc's existing datasets.
Design, implement, and execute semi-supervised and weakly-supervised VLM and segmentation training pipelines.
Develop evaluation suites and error taxonomies for targeted multimodal tasks.
Run rigorous ablations and scaling experiments, track results, and maintain reproducibility and research hygiene throughout.
Document findings and present results through technical reports, demos, and a submission-ready draft.

Benefits

Higharc offers competitive salaries with significant equity, in a fast-growing, well-funded company.
Personal health is an important value for us: we provide comprehensive medical, dental, and vision coverage, unlimited PTO, and meaningful maternity/paternity leave to all full-time U.S.-based employees.
You'll also have access to other big-company benefits such as short- and long-term disability plans and a 401(k).
Haven't worked remotely before? We provide a stipend to create the ideal home office.
