Senior AI Infrastructure Engineer

Stoke Space•Kent, WA

$192,885 - $347,235•Onsite

About The Position

At Stoke, we believe a thriving space economy will enable a vibrant, sustainable, and equitable future here on Earth. That is why we’re building Nova, our fully and rapidly reusable launch vehicle. Designed for daily flight, Nova tackles the core challenges of space transportation by reducing cost, increasing availability, and improving reliability. By radically lowering launch costs and increasing flight cadence, we’re helping create a truly scalable space industry. Our team is mission-driven, collaborative, and empowered to take ownership of their work. If you want to work alongside some of the most dedicated and talented people on Earth, we’d love to have you join us.

Reusable launch systems are the key to seamlessly connecting Earth and space. Just as our rocket systems are designed to be reliable, automated, and intelligent, the internal tools that power our engineering and business operations must embody those same principles to help our teams move faster, work smarter, and stay focused on the mission.

We are looking for a Senior AI Infrastructure Engineer to design, build, and deploy AI-powered tooling across Stoke’s internal systems. As part of our IT Internal Tooling team, you will identify high-leverage opportunities to apply AI - including large language models, agentic systems, retrieval pipelines, and intelligent automation - and turn them into production-grade tools that meaningfully accelerate the people building rockets.

This role requires deep technical expertise in modern AI engineering, software development, and systems integration, combined with a strong product instinct for what to build, what not to build, and how to ship AI tooling that actually works. You will work closely with engineers, operators, and business stakeholders across Stoke to understand their workflows, identify where AI can remove friction or unlock new capabilities, and design and implement robust solutions end-to-end.

This is a high-impact role where your work directly shapes how our company operates and how quickly our teams can deliver. You must be ready to stay focused, move quickly, self-direct, and learn on the fly.

Requirements

Bachelor's or Master's degree in Computer Science, Software Engineering, or a related technical field, or equivalent practical experience
Combined 5–8 years of experience in software engineering, applied AI/ML, or related roles, with at least 2 years building and shipping production AI or LLM-powered systems
Proven track record of designing, deploying, and operating AI-powered applications at scale in a production environment
Strong proficiency in at least one modern language commonly used in AI tooling (e.g., Python, TypeScript, Go)
Deep, hands-on experience with modern LLM tooling and patterns, including prompt engineering, function/tool calling, agents, RAG, and structured output
Working knowledge of one or more major foundation model providers (e.g., Anthropic, OpenAI, AWS Bedrock, Azure OpenAI) and the trade-offs between them
Strong software engineering fundamentals: testing, code review, CI/CD, version control, observability, and clean API design
Experience integrating with enterprise systems via REST APIs, webhooks, message queues, and event-driven architectures
Solid understanding of evaluation methodologies for AI systems, including offline evals, golden datasets, and online quality metrics
Strong understanding of AI security and privacy concerns, including data leakage, prompt injection, model misuse, and least-privilege data access
Excellent communication skills and the ability to translate fuzzy business problems into concrete, shippable AI solutions

Nice To Haves

Experience building agentic systems with multi-step tool use, planning, and long-running workflows
Experience with vector databases, embedding models, hybrid search, and advanced retrieval techniques
Experience deploying AI workloads on AWS (Bedrock, Lambda, ECS/Fargate) or comparable cloud platforms
Experience with model fine-tuning, distillation, or training custom models for narrow internal use cases
Familiarity with ITAR, export controls, or other regulated-data environments and the implications for AI tooling
Experience with MCP (Model Context Protocol) servers, internal developer platforms, or building reusable AI infrastructure for other engineers
Background in IT, internal tools, or digital transformation functions, with a track record of successful enterprise software rollouts
Passion for learning new technologies and sharing knowledge with teammates

Responsibilities

Design, develop, and deploy AI-powered internal tools end-to-end – from initial use-case discovery and prototyping through production deployment, monitoring, and iteration
Build LLM-powered applications, agents, retrieval-augmented generation (RAG) systems, and intelligent workflow automations that integrate with Stoke’s internal systems and data sources
Partner with stakeholders across engineering disciplines, manufacturing, supply chain, finance, and other teams to identify high-leverage opportunities for AI tooling and translate them into clear technical requirements
Evaluate and select foundation models, frameworks, and platforms; build robust prompt, evaluation, and guardrail systems; and make pragmatic build-versus-buy decisions
Develop and maintain the AI engineering stack used by the team, including model gateways, vector stores, evaluation harnesses, observability, and deployment pipelines
Implement rigorous evaluation, testing, and monitoring practices for AI systems, including offline evals, online metrics, regression tracking, and human-in-the-loop review
Design and implement integrations with internal systems and data, including telemetry data, systems requirements, document stores, code repositories, and business applications – using APIs, webhooks, and event-driven patterns
Apply strong security and compliance practices to AI tooling, including data handling, access controls, prompt injection defenses, and ITAR/export-control considerations
Operate AI services in production, including capacity planning, cost optimization, latency tuning, and incident response
Produce clear technical documentation, runbooks, and architectural decision records; mentor team members on applied AI best practices and help raise the bar across the organization

Benefits

Equity in the form of stock options
Comprehensive benefits program including subsidized medical, dental, and vision insurance
Company-paid life and disability insurance
401(k) plan with employer match
4 weeks’ Paid Time Off
10 holidays (including an end-of-year closure)
Paid Family/Parental Leave
On-site gym or monthly wellness stipend (depending on location)
Dog-friendly offices!
