Data Scientist

Jobber

8d•CA$125,800 - CA$170,100•Hybrid

About The Position

Are you passionate about ML model quality and building trust in AI systems? If so, this might be the role for you! We’re looking for a Data Scientist to join our growing ML & AI practice. You will play a key role in ensuring the reliability, accuracy, and performance of our machine learning models, from evaluation framework design to ongoing production monitoring. If you’re someone who geeks out over loss functions, loves writing rigorous evals, and takes pride in knowing a model truly works before it ships, we want to hear from you!

At Jobber, we don’t just build a product, we work on real problems that help people in small businesses become successful. We are inspired by our company values: be humble, be supportive, and give a shit, which are not just said but are lived. We work in a collaborative environment where teams make decisions with autonomy and contribute directly to shaping the company’s future.

The Strategy and Analytics Department ensures our people at Jobber have the tooling, data insights, and strategic direction to excel in our shared mission. We turn data into actionable insights, and critical business needs into impactful software, working with multiple teams and departments across the company. Strategy & Analytics serves as a central hub that drives business outcomes in all corners of Jobber’s ecosystem.

Reporting to the Director, Data Science, the Data Scientist will be a core contributor to the quality and reliability of our ML and AI systems. Your primary focus will be ML model validation and monitoring, designing and executing MCP (Model Context Protocol) evaluations, and building regression test suites that give the team confidence at every stage of the model lifecycle. You will work closely with senior data scientists, MLOps, and product teams to keep our models honest and our business protected.

Requirements

Industry experience in data science, machine learning, or a closely related quantitative field.
Proficiency in Python and the core DS stack: pandas, scikit-learn, XGBoost, and at least one deep learning framework (PyTorch or TensorFlow).
Solid grasp of statistical concepts underpinning model evaluation: bias–variance tradeoff, calibration, confidence intervals, A/B testing, and data drift.
Experience with LLM evaluation frameworks (e.g. Ragas, EleutherAI's LM Evaluation Harness, or custom LLM eval pipelines).
Hands-on experience designing custom evaluation metrics; you've gone beyond off-the-shelf metrics when the problem demanded it.
Strong understanding of ML and LLM model architectures — you can reason about how a model is built and why it behaves the way it does.
High proficiency in SQL for data exploration, feature validation, and debugging model inputs.
Exceptional attention to detail — you treat model validation with the same rigour as software QA.
Strong written and verbal communication skills; comfortable presenting findings to both technical peers and non-technical stakeholders.

Nice To Haves

Experience building model evaluation or monitoring solutions within a Snowflake environment, e.g. logging predictions and ground truth to Snowflake tables, computing eval metrics via Snowpark, or building monitoring dashboards on top of Snowflake data.
Familiarity with Snowpark (Python) for running data transformations and ML workflows directly within Snowflake.
Exposure to model regression testing patterns in a CI/CD context (e.g. running evals on every PR).
Familiarity with prompt engineering and evaluation strategies for LLM-powered features.
Experience working in a SaaS environment and an appreciation for how model quality translates to customer impact.

Responsibilities

Design, implement, and maintain ML model validation frameworks, including custom evaluation metrics, loss functions, and statistical tests, to ensure model quality before and after deployment.
Build and own regression test suites for ML and LLM models, catching performance regressions and unexpected behaviour across model updates and data drift scenarios.
Develop and execute MCP evaluations, systematically assessing model capabilities, edge cases, and failure modes across relevant business contexts.
Monitor models in production using statistical process control, drift detection, and alerting pipelines; proactively surface issues before they impact customers.
Collaborate with senior data scientists on the design and refinement of ML model architectures, offering feedback grounded in validation results.
Document evaluation methodologies, test results, and monitoring runbooks clearly enough that stakeholders across technical and business teams can understand model health.
Stay current with advancements in LLM evaluation techniques, AI safety, and model observability, and apply emerging best practices to our workflows.
Communicate findings clearly and concisely to stakeholders, translating model performance signals into actionable recommendations.