Senior Machine Learning Engineer

Evolution Cloud Services (EVOCS)

About The Position

EVOCS’s journey began with a mission to empower businesses through advisory expertise, paired with the right technologies, to provide comprehensive solutions that help them grow and prosper. Founded by a team of passionate experts, EVOCS has grown into a trusted partner to a growing number of leaders across their respective industries. Our roots in employee-managed operations reflect our commitment to quality, consistency, and client success. If you enjoy working in a hyper-fast-growing company, are eager to be part of an agile team, and want to be part of our success story, then let’s talk!

As a Senior Machine Learning Engineer, you will be the person we trust with the training side of our AI work. You’ll decide what to build, how to build it, and whether to build it at all. You will be responsible for the quality of the models we ship: the data they learn from, the pipelines that produce them, and the judgment calls that separate a useful model from an expensive one. You’ll also mentor engineers who’ve never watched a loss curve diverge and felt something.

Requirements

  • 5+ years of ML engineering experience, with meaningful time spent fine-tuning transformer models end-to-end.
  • Strong Python and PyTorch, plus fluency with the Hugging Face stack (transformers, datasets, accelerate, peft, trl).
  • You’ve built or seriously operated a distributed training setup, not just read about one.
  • Azure AI Foundry experience (or strong Azure ML adjacent and willingness to get deep), plus SQL and at least one data pipeline tool (dbt, Airflow, Dagster, or Spark).
  • Experiment tracking discipline (W&B, MLflow, or a spreadsheet you defend philosophically).
  • Solid engineering fundamentals: Git, Docker, and the ability to actually ship.
  • Fluent in English (written and spoken) – bilingual or near-native level
  • Strong interpersonal and communication skills – this is a client-facing role that involves frequent interaction via email, calls, and meetings

Nice To Haves

  • Bonus for JAX; extra bonus for having read a CUDA kernel and not flinched.
  • Run quantized models locally — GGUF, GPTQ, AWQ, MLX — and know what K-quants are and why Q4_K_M is usually the sweet spot.
  • Familiarity with the whisper.cpp / flash-moe universe — efficient inference on hardware that shouldn’t be able to do that.
  • A strong take on MoE routing, speculative decoding, or why KV-cache management is more interesting than it has any right to be.
  • RLHF, DPO, or preference data curation experience.

Responsibilities

  • Be the primary owner of the data behind our AI initiatives.
  • Build and maintain training and retraining pipelines in Azure AI Foundry.
  • Make the real model design calls: full fine-tune vs. LoRA/QLoRA vs. DPO vs. “better prompting would save us three weeks.”
  • Run hyperparameter work that isn’t a grid search copied from a 2021 Medium post.
  • Operate distributed training setups and know what breaks at scale.
  • Design eval harnesses that catch what’s actually wrong, with a skeptical eye on benchmark contamination.
  • Ship models into production as the load-bearing piece of the product, not a feature slapped on the side.
  • Mentor engineers who can call an inference endpoint but have never trained one themselves.