Senior AI Data Engineer

iManage, Toronto, ON
$120,000 - $170,000, Hybrid

About The Position

Being a Senior AI Data Engineer at iManage Means…

You’re passionate about transforming unstructured text into meaningful insights that power AI and machine learning solutions. You thrive at the intersection of data engineering, AI, and natural language processing, building the pipelines and datasets that fuel generative AI applications, agentic systems, advanced model fine-tuning, and other NLP-driven capabilities across iManage.

As a Senior AI Data Engineer on the Applied AI team, you will design, build, and optimize large-scale text data pipelines that power AI/ML and generative AI solutions for our customers. You’ll work with knowledge engineering, applied AI, and product teams to prepare, enrich, and integrate document data. Your work will be essential to enabling intelligent, AI-powered features across the iManage platform.

Requirements

  • A Bachelor’s degree or higher in Computer Science, Data Engineering, Applied Mathematics, Computational Linguistics, or a related quantitative field.
  • 4+ years of data engineering experience, with at least 2 years working with unstructured data in a business setting.
  • Strong proficiency in Python, PySpark, and data manipulation for large unstructured text datasets.
  • Strong understanding of NLP concepts such as tokenization, embeddings, and semantic search, plus experience with standard text libraries such as spaCy, Hugging Face Datasets, and NLTK.
  • Solid DataOps knowledge and experience orchestrating advanced NLP data pipelines on cloud-based data infrastructure.
  • Proficiency with Git and collaborative development frameworks.
  • A passion for enabling AI capabilities through scalable, reliable data architecture.
  • Problem solving, creativity, curiosity, and a collaborative mindset.

Nice To Haves

  • Exposure to Microsoft Azure services such as Fabric, ADLS, AI Foundry, Azure ML, and MLflow
  • Experience with knowledge graph implementation for NLP applications
  • Experience working with data for the legal domain
  • Experience designing architectures for large-scale text corpora

Responsibilities

  • Designing, developing, and maintaining scalable pipelines in Microsoft Azure to ingest and transform large volumes of text data from multiple sources
  • Designing automated workflows for text normalization, deduplication, language identification, PII redaction and metadata enrichment
  • Building automated data validation processes to ensure accuracy and consistency
  • Supporting model fine-tuning, semantic search, and generative AI evaluation through dataset curation, prompt dataset preparation, labeling coordination, and text quality validation
  • Partnering with the Applied AI team to gather data requirements and build data interfaces for developing and maintaining machine learning systems
  • Maintaining data lineage and following data privacy, security and governance best practices
  • Implementing data versioning and lineage tracking for machine learning experiments

Benefits

  • Flexible working policy
  • Market competitive salary
  • Annual performance-based bonus
  • Comprehensive Health/Vision/Dental/Life Insurance
  • Registered Retirement Savings Plan with a company match up to 5%
  • Enhanced leave for expecting parents (20 weeks 100% paid for primary leave, and 10 weeks 100% paid for secondary leave)
  • Flexible time off policy
  • Multiple company wellness days each year
  • Access to RethinkCare, a global behavioral health platform