Staff Developer - AI/ML

Benevity•Calgary, AB

Hybrid

About The Position

The Staff AI/ML Engineer will be the technical anchor for our machine learning and AI efforts and will lead the architecture of our GenAI strategy. You will bridge the gap between theoretical research and production-grade engineering. This role requires a deep understanding of how to design the scalable systems that power our model training, deployment, and monitoring, ensuring our AI initiatives deliver measurable business value. The ideal candidate will have a strong technical background combined with leadership experience.

Requirements

Bachelor’s or Master’s degree in Computer Science, Mathematics, or a related field, or equivalent deep professional experience.
7+ years of software engineering experience, including at least 4 years architecting and deploying ML models in production at scale.
Proven experience operating at a Staff or Senior level, including technical leadership, architecture ownership, and mentoring engineers.
Deep expertise in MLOps and production ML systems, including model training, evaluation, deployment, monitoring, and lifecycle management.
Strong experience with cloud platforms (AWS, Azure, or Google Cloud), including designing and operating scalable, distributed AI/ML workloads.
Solid understanding of data architecture and data engineering, including data pipelines, feature engineering, data modeling, and large-scale data processing.
Experience with ML infrastructure and tooling, such as feature stores, experiment tracking, model registries, and orchestration frameworks.
Proficiency in Python and ML frameworks (e.g., TensorFlow, PyTorch, scikit-learn), with strong software engineering fundamentals.
Experience with CI/CD and ML deployment pipelines, including automated testing, validation, and rollback strategies for ML systems.
Familiarity with LLM-based systems and GenAI applications, including systems design, evaluation strategies, and observability for non-deterministic systems.
Strong understanding of LLM architectures and trade-offs, including model selection, latency, cost, and quality optimization.
Experience with prompt engineering and prompt orchestration, including techniques like few-shot learning, chain-of-thought, and tool/function calling.
Experience designing and implementing RAG (Retrieval-Augmented Generation) systems, including embedding strategies, vector databases, and retrieval optimization.
Experience building agentic workflows, including multi-step reasoning, tool use, and orchestration frameworks (e.g., LangChain, LlamaIndex, ADK, or custom frameworks).
Strong understanding of system design for reliability and scalability, including distributed systems, APIs, and microservices architecture.
Knowledge of data governance, model governance, and responsible AI practices (security, privacy, bias, explainability).
Demonstrated ability to translate ambiguous business problems into scalable AI/ML solutions.
Excellent communication and collaboration skills, with the ability to influence stakeholders and drive cross-functional alignment.

Nice To Haves

Certification in relevant cloud platforms or technologies.

Responsibilities

Design and oversee the development of robust end-to-end ML architecture, from data ingestion and feature stores to model serving and monitoring.
Define the long-term roadmap and strategy for our AI infrastructure and orchestration framework.
Oversee the design, implementation, and maintenance of our AI/ML ecosystem.
Set the standard for MLOps, ensuring that our ML ecosystem is as testable, maintainable, and scalable as our core application code.
Provide cross-functional leadership by working closely with Product Managers, Data Scientists, and ML Engineers to translate business problems into concrete technical requirements.
Act as a force multiplier for the team by conducting high-level design reviews and mentoring engineers on system design and performance optimization.
Architect and evolve LLM-powered applications (e.g., copilots, search, assistants, agents), including RAG pipelines, tool integrations, and multi-step reasoning workflows.
Design and implement robust evaluation frameworks for GenAI systems, incorporating offline benchmarks, online metrics, and human-in-the-loop feedback.
Drive best practices for prompt engineering, agent design, and orchestration frameworks, ensuring maintainability and performance at scale.
Establish guardrails and safety mechanisms for GenAI applications, including prompt injection defenses, hallucination mitigation, and responsible AI practices.
Establish the golden path for model versioning, A/B testing, and automated rollbacks to identify and mitigate drift.
Ensure the AI architectural strategy aligns with industry best practices and standards and complies with security policies and industry regulations.
Identify opportunities for process improvements and implement solutions to enhance platform performance and efficiency.