Data Architect - Montreal, Canada

Human Agency
Remote

About The Position

We’re seeking a Data Architect to design modern, AI-ready data architectures across multiple client engagements. This role sits at the intersection of data modeling, semantic layer design, feature engineering, and AI enablement. You’ll architect systems that make data reliable, reusable, and production-ready for business intelligence, machine learning, and artificial intelligence. You should be equally comfortable designing the data backbone for AI-driven products, writing SQL or Python to unblock a model pipeline, or guiding teams through tradeoffs between flexibility, cost, and responsible automation.

Requirements

  • 7+ years in data engineering/analytics engineering with ownership of production pipelines and BI at scale.
  • Demonstrated success owning and stabilizing production data platforms and business-critical pipelines, including defined SLOs and incident response.
  • Strong grasp of modern data platforms (e.g., Snowflake), orchestration (Airflow), and transformation frameworks (dbt or equivalent).
  • Competence with data integration (ELT/ETL), APIs, cloud storage, and SQL performance tuning.
  • Practical data reliability experience: observability, lineage, testing, and change management.
  • Ability to operate effectively in ambiguous, partially documented environments and to quickly create order through documentation and standards.
  • Demonstrated client-facing experience (consulting/agency or internal platform teams with cross-functional stakeholders) and outstanding written/verbal communication (executive briefings, workshops, decision memos).
  • Bachelor’s degree or equivalent experience.
  • Commitment to ethical practices and responsible AI.

Nice To Haves

  • Deep interest in Generative AI and Machine Learning.
  • Practical Generative AI experience: shipped at least one end-to-end workflow (e.g., RAG) including ingestion, embeddings, retrieval, generation, and evaluation.
  • Working knowledge of LLM behavior (tokens, context windows, temperature/top-p, few-shot/tool use) and how to tune for quality/cost/latency.
  • Comfort with vector search (e.g., pgvector or a hosted vector store) and hybrid retrieval patterns.
  • Evaluation & safety basics: offline evaluation harnesses, lightweight online A/B tests, and guardrails for PII and prompt-injection.
  • MLOps for LLMs: experiment tracking, versioning of prompts/configs, CI/CD for data & retrieval graphs, and production monitoring (latency, cost, drift).
  • Python scripting for data/LLM utilities and service integration (APIs, batching, retries).
  • Familiarity with BI tools (Power BI/Tableau) and semantic layer design.
  • Exposure to streaming, reverse ETL, and basic MDM/reference data management.
  • Security & governance awareness (role‑based access, least privilege, data retention).

Responsibilities

  • Design and implement end-to-end data architectures in Snowflake — from raw ingestion through staging, fact/dimension modeling, and semantic layer design.
  • Define data models that balance flexibility for analysts with performance and scalability for production.
  • Partner with engineering teams to integrate data from source applications and operational systems.
  • Establish versioned modeling standards and documentation to ensure consistency across domains.
  • Build or refine semantic layers that unify metric definitions across BI tools like Tableau, Power BI, or Looker.
  • Collaborate with business owners to define KPIs, approve new metrics, and monitor adoption.
  • Implement versioned datasets and definitions to support reliable analytics and reporting.
  • Architect feature pipelines and data contracts that support point-in-time correctness for machine learning models.
  • Collaborate with data scientists and AI engineers to implement reusable feature stores for both training (offline) and deployment (online) use.
  • Monitor data quality and prevent data leakage that could affect model performance.
  • Support event-driven architectures that bridge predictive models with operational systems.
  • Partner with AI teams to integrate structured and unstructured data into generative and agentic workflows (e.g., RAG, copilots, automated evaluation agents).
  • Design APIs or event structures that serve predictions and triggers in near real time.
  • Measure adoption and value of AI-driven workflows through data instrumentation.