At d-Matrix, we are focused on unleashing the potential of generative AI to power the transformation of technology. We are at the forefront of software and hardware innovation, pushing the boundaries of what is possible. Our culture is one of respect and collaboration. We value humility and believe in direct communication. Our team is inclusive, and our differing perspectives allow for better solutions. We are seeking individuals who are passionate about tackling challenges and driven by execution. Ready to come find your playground? Together, we can help shape the endless possibilities of AI.

The d-Matrix Frontier Group sits at the leading edge of what’s possible with LLM inference on heterogeneous hardware. Our charter spans the full stack: from pathfinding emerging use cases and novel deployment patterns, to deep optimization of inference kernels, to building proof-of-concept systems that showcase d-Matrix’s unique computational fabric. We are an applied research and engineering team that moves fast, ships real systems, and works directly with product and hardware teams to shape the roadmap.

We build the tools, runtimes, and frameworks that let frontier AI models run efficiently and cost-effectively across heterogeneous deployments that combine d-Matrix silicon with CPUs, GPUs, and custom accelerators. Our work powers everything from benchmarking and evaluation pipelines to production-grade inference serving.

This Role
We are hiring end-to-end inference engineers who are comfortable going from a novel research idea to a deployed, optimized system. You will work at every layer of the inference stack, from kernel-level optimization to distributed orchestration to high-level serving APIs.

This role could be a great match for you if you:
• Have deep intuition for modern generative AI architectures and how to squeeze performance out of them at inference time.
• Are familiar with the internals of open-source inference frameworks (vLLM, SGLang, TensorRT-LLM, etc.) and can extend or replace them when needed.
• Enjoy pathfinding new use cases, exploring heterogeneous deployment topologies, and building early-stage POCs that prove out new ideas.
• Are results-oriented with a strong bias toward action; you own problems end-to-end, from prototype to optimization to handoff.
• Are energized by working at the intersection of novel hardware and frontier models, and want your work to directly influence how next-generation AI silicon is used.
• Value clear communication and thrive in a small, high-ownership team environment.
Job Type
Full-time
Career Level
Principal