About The Position

NVIDIA DGX Cloud is the AI supercomputing-as-a-service substrate designed to power the next generation of AI and industrial-scale breakthroughs. As a Security Engineer within our Infrastructure Security Engineering organization, you will not just help "secure" our platform; you will architect and build the foundational security primitives that protect massive-scale GPU clusters. You will design automated, resilient security systems that help ensure the integrity of our omni-cloud and on-premises AI infrastructure. We recognize that a candidate who checks every single box is rare. We aren't looking for a checkbox hire; we are looking for high-caliber engineers with deep spikes of expertise in a few of these areas and the intellectual curiosity to dive into the rest. If your experience aligns with the core of this role, building resilient security systems, and you can show us how, we want to hear from you!

Requirements

  • Infrastructure Engineering: Bachelor's degree or equivalent experience with 5+ years in SRE, Software Engineering, and Infrastructure Security. You focus on building systemic solutions rather than performing manual operations or "tool administration."
  • Production-Grade Coding: A strong software engineering background with the ability to write clean, maintainable, and well-tested code. You should be comfortable building and maintaining production services at scale.
  • Distributed Systems Expertise: Understanding of cloud-native architecture, container orchestration (Kubernetes), and the security challenges inherent in high-throughput, low-latency environments.
  • Platformizing Security: Transform complex security requirements into consumable internal services. You will focus on the "Developer Experience" of security, ensuring that our infrastructure security controls are delivered as robust, API-first platforms that integrate seamlessly with NVIDIA’s internal engineering workflows.
  • Security Product Integration: Proven track record of taking complex security products (AuthN/AuthZ, Vaulting, Scanning, IDS) and integrating them into automated infrastructure via APIs and custom glue code.
  • Linux Internals: Strong hands-on experience with Linux systems security, including kernel-level primitives (eBPF, AppArmor, or SELinux).

Nice To Haves

  • HPC/AI Security: Experience securing high-performance computing environments, RDMA-based networks, or GPU-specific security challenges.
  • Cloud-Native Identity: Expertise in workload identity frameworks (e.g., SPIFFE/SPIRE) and hardware-root-of-trust (TPM/HSM) integration.
  • Open Source Impact: Notable contributions to security-focused open-source projects or a track record of engineering-focused security research. How have you represented and helped advance the industry?

Responsibilities

  • Security Engineering: Design, build, and integrate production-grade security services. You will focus on the engineering of security products—transforming third-party and open-source tools into seamless, API-driven components of the DGX Cloud security stack.
  • Automated Policy Enforcement: Shift security "left" by developing Infrastructure as Code and Policy as Code to automate security enforcement and compliance at the speed of cloud-scale deployment.
  • Orchestration Security & Guardrails: Architect and implement the security control plane. You will engineer automated guardrails, controllers, and runtime security policies that validate and enforce the integrity of tenant boundaries.
  • Security-as-a-Service Approach: Design and operate security services as a scalable platform. Build "self-service" security primitives (e.g., Identity-as-a-Service, automated secrets management, and real-time scanning APIs) that allow developer teams to move fast.
  • Security Tooling & Lifecycle: Develop internal security frameworks and automated response systems. Own the full software development lifecycle (SDLC) of the security tools, including testing, deployment, and maintenance.
  • Threat Modeling & System Design: Conduct deep-dive threat models on complex distributed systems and the DGX Cloud stack, identifying architectural gaps in security and engineering the solutions to close them.
  • Multi-Functional Collaboration: Partner with DGX Cloud platform teams, broader NVIDIA security teams, and product engineering to understand their needs and build paved paths that seamlessly embed security into the CI/CD pipeline and the hardware lifecycle.

Benefits

  • You will also be eligible for equity and benefits.