AI Researcher (AI-Oriented Knowledge Systems)

GenScript/ProBio•Piscataway, NJ

52d•Onsite

About The Position

GenScript Biotech Corporation (Stock Code: 1548.HK) is a global biotechnology group. Founded in 2002, GenScript has an established global presence across North America, Europe, Greater China, and Asia Pacific. Building on its leading gene synthesis technology, GenScript's businesses span four major categories: life science CRO services, enzyme and synthetic biology products, biologics development and manufacturing, and cell therapy. GenScript is committed to its vision of being the most reliable biotech company in the world, making humans and nature healthier through biotechnology.

Requirements

Master’s degree or above in Computer Science, Artificial Intelligence, Information Management, or related fields
3+ years of AI-related research or development experience with hands-on experience in knowledge graphs, RAG, or QA systems
Publications in top-tier conferences (ACL, EMNLP, SIGIR, WWW, NeurIPS, etc.) preferred
Proficient in Python with expertise in data processing and large-scale text processing techniques
Familiar with mainstream NLP frameworks (spaCy, NLTK, HuggingFace Transformers, etc.)
Experience with graph databases (Neo4j, NebulaGraph, JanusGraph, etc.)
Familiar with vector databases (Milvus, Chroma, Weaviate, FAISS, etc.)
Deep understanding of core NLP technologies: entity recognition, relation extraction, text classification, semantic similarity
Familiar with the full lifecycle of knowledge graph construction and application
Proficient in the RAG technology stack with hands-on experience in retrieval optimization, re-ranking, and answer generation
Familiar with multi-modal knowledge processing (text + image + table + structured data)
Familiar with large-scale data processing technologies (Spark, Flink, Dask, etc.)
Familiar with common data formats and protocols (JSON, XML, RDF, OWL, etc.)
Ability to conduct independent technical research, owning the full process from problem definition to solution deployment
Strong literature review and summarization skills with the ability to quickly absorb cutting-edge research findings
Experimental design and evaluation capabilities, able to design proper comparative and ablation studies
Strong interest in the intersection of knowledge engineering and AI, keeping up with the latest developments in the field
Excellent communication and collaboration skills, able to work efficiently with engineering and product teams
Systems thinking ability to approach knowledge system design from an overall architecture perspective

Nice To Haves

Prior experience in vertical domain knowledge system construction and knowledge-driven LLM application deployment (e.g., healthcare, legal, finance, technology) preferred
Experience in data governance such as data cleaning, deduplication, and standardization preferred

Responsibilities

Knowledge Extraction & Structuring: research techniques for extracting structured knowledge from multi-source heterogeneous data (documents, web pages, databases, conversation logs); design automated pipelines for entity recognition, relation extraction, and event detection; develop knowledge quality assessment and cleaning mechanisms to filter noise and conflicting information; explore LLM-assisted knowledge extraction methods, balancing automation efficiency with manual validation costs; research incremental knowledge extraction strategies to support continuous knowledge base updates and expansion
Knowledge Organization & Representation: design knowledge graph schemas and ontologies to build structured frameworks for domain knowledge; research knowledge embedding techniques to fuse knowledge with vector spaces; develop multi-level knowledge representation systems supporting coarse-to-fine knowledge navigation; explore knowledge fusion and alignment techniques to handle entity disambiguation and conflict resolution across multi-source knowledge; research knowledge version management and provenance mechanisms to ensure knowledge traceability
Knowledge Retrieval & Augmentation: optimize RAG (Retrieval-Augmented Generation) systems to improve retrieval accuracy and answer quality; research hybrid retrieval strategies combining vector search, keyword search, graph traversal, and other approaches; develop retrieval re-ranking algorithms to enhance Top-K result relevance; design retrieval-generation collaborative optimization mechanisms to reduce hallucinations and erroneous citations; explore retrieval feedback learning to continuously optimize retrieval strategies based on user behavior
Knowledge Reasoning & Question Answering: research knowledge graph-based reasoning techniques supporting multi-hop, logical, and causal reasoning; develop complex QA systems supporting multi-condition and multi-step question answering; explore methods for fusing LLMs with symbolic reasoning, leveraging the advantages of both neural and symbolic approaches; design interpretability frameworks for reasoning processes, supporting answer provenance and reasoning chain visualization; research knowledge gap detection and active learning mechanisms to identify coverage blind spots in the knowledge base
Knowledge Update & Maintenance: design knowledge timeliness management mechanisms supporting knowledge expiration detection and automatic updates; research knowledge conflict detection and resolution strategies for fusing contradictory information; develop knowledge base health monitoring systems tracking coverage, accuracy, freshness, and other metrics; explore human-feedback-driven knowledge iteration mechanisms; research knowledge compression and summarization techniques to optimize storage efficiency and retrieval performance
