Senior Machine Learning Engineer

Seven AI • Boston, MA

About The Position

We are seeking a Senior Machine Learning Engineer to join our core engine team. You will develop and enhance our multi-agent AI technology, focusing on building our core LLM technology and services. Your role involves delivering a robust, scalable, cloud-native architecture to production while optimizing for precision, recall, and cost.

About Seven AI

We are seeking professionals of all levels who are eager to make a substantial impact and excel in a high-growth, dynamic environment. As AI is advancing at a pace never seen before, you'll join us at a pivotal stage, where your expertise can shape the future of cybersecurity. You'll have the opportunity to work on the bleeding edge of technology and drive true innovation, all while collaborating closely with industry veterans who are dedicated to defending the market from the new wave of AI-driven attacks.

Our culture is centered around respect, collaboration, proactiveness, and a shared commitment to delivering exceptional value to our customers. If you're passionate about building something extraordinary and thrive in an environment where your contributions truly matter, we'd love to connect with you.

Requirements

5+ years of experience in delivering cloud-based ML applications to production, analyzing their performance and driving improvements.
Strong theoretical background in statistical analysis, ML, and graph theory.
Bachelor's degree or equivalent in a relevant field.
Experience with cloud-native architectures and deploying ML solutions on cloud platforms.
Excellent problem-solving and communication skills.
Highly curious. Able to work backwards from customer value and deliver iterative value quickly.

Nice To Haves

A graduate-level degree is an advantage.

Responsibilities

Develop and maintain core algorithms and services for our multi-agent AI technology.
Deliver scalable, cloud-native architecture to production and optimize algorithms for precision, recall, and cost.
Collaborate with cross-functional teams to integrate ML solutions into our platform.
Ensure the reliability of ML models in production by continuously monitoring and improving model performance.
Optimize current and future custom transformer-based language models for latency and throughput on NVIDIA GPUs and other device accelerators, using CUDA, TensorRT, TensorRT-LLM, etc.
Augment large language model agents with multimodal data, search indices, and advanced retrieval techniques - including Retrieval-Augmented Generation (RAG) and Graph Retrieval-Augmented Generation (Graph-RAG) - to enable complex, scalable search infrastructures within agentic systems.
Fine-tune transformer-based language models.