Staff AI Infrastructure Engineer

Why Nuclearn.ai

Nuclearn.ai builds AI-powered software for the nuclear and utility industries: tools that keep critical infrastructure reliable, efficient, and safe. Our software integrates AI-driven workflow, documentation, and research automation, and is already used at 60+ nuclear reactors across North America. You'll ship production code operators and engineers rely on every day. We're growing quickly, expanding our team and our Phoenix HQ. The work is consequential: what you build helps real plants run safer and smarter.

Eligibility: U.S. citizenship or permanent residency (green card) is required due to DOE export compliance.

What You’ll Do

Own AI hardware architecture end-to-end
Design GPU/CPU systems that deliver consistent, high-performance AI workloads. Define storage, networking, container runtime, and OS standards in partnership with ML and platform engineering teams.

Run and scale our Phoenix AI data center
Manage rack architecture, power/cooling constraints, redundancy, monitoring, firmware lifecycle, and capacity planning. Identify bottlenecks early and fix them before they impact production.

Partner directly with utility IT teams
Analyze and validate customer infrastructure intended to host Nuclearn applications. Conduct architecture reviews, confirm configuration alignment, and prevent GPU/runtime incompatibilities before go-live.

Drive hardware lifecycle evolution
Plan GPU refreshes, expansion pathways, and just-in-time capacity upgrades to ensure infrastructure keeps pace with model complexity and platform growth.

You will operate as a senior individual contributor with high autonomy and direct influence across engineering, ML, product, and customer environments.
Examples of problems you might own in your first 90 days

Develop and publish a clear AI hardware requirements standard for both internal deployment and customer-facing environments, including GPU sizing models, storage thresholds, networking requirements, and supported configurations.
Analyze and validate a utility customer’s proposed infrastructure architecture before deployment, identifying performance gaps, GPU/runtime misalignment, or security configuration issues and providing concrete remediation guidance.
Audit the Phoenix data center and execute just-in-time infrastructure upgrades (adding GPU capacity, expanding storage, or rebalancing workloads) to maintain sustained high-performance AI execution as usage scales.

What Makes You a Great Fit

Degree in Computer Engineering, Electrical Engineering, Computer Science, or equivalent practical experience
Proficiency in Linux server administration and GPU-based AI systems
Strong experience deploying and tuning NVIDIA GPU environments for ML workloads
Familiarity with containerized runtimes (Docker, Kubernetes) and AI model hosting
Excellent troubleshooting skills at the hardware/software boundary
Ability to operate independently in a fast-moving, high-ownership startup environment

You are hands-on. You think in systems. You move quickly without sacrificing rigor. You are comfortable being the technical authority in the room when discussing infrastructure with senior engineers or enterprise IT leaders.
Nice to Have (Not Required)

Experience in utility IT, energy infrastructure, or other regulated industries
Experience supporting on-prem or air-gapped environments
Prior responsibility for production data center operations
Familiarity with cybersecurity expectations common to critical infrastructure environments

Impact You’ll Have (near-term roadmap)

Establish a standardized AI hardware reference architecture used across all deployments
Build a scalable infrastructure refresh strategy that prevents hardware drift and obsolescence
Make AI infrastructure a strategic advantage: stable, scalable, and trusted by customers

Compensation & Benefits

Base salary: $120k - $165k
Equity: 0.025% - 0.125%
Benefits: Unlimited PTO, health/dental/vision insurance, 4% 401k match

Work Model & Schedule

Full-time, salaried
Mon–Fri hybrid (Wed remote); expectation is ≥80% in-office (Phoenix HQ)

How We Hire (fast, respectful, practical)

20-min intro with the founder/hiring manager to trade context and assess mutual fit
Practical work sample (60–90 min; a real task in our stack)
Team meet + peer programming (system design + collaboration)

We aim to move from first chat to decision quickly.
Job Type: Full-time
Career Level: Senior