About The Position

CIQ is seeking a highly experienced Senior or Principal AI Engineer to own and drive AI/ML innovation across our product portfolio. This role sits at the intersection of AI engineering and systems performance - the right candidate brings deep expertise in model inference optimization, training workflows, and production AI deployment, combined with a strong instinct for performance at the systems level. In this role, you will be the AI engineering standard-bearer at CIQ. You will design and build turnkey AI workload examples - both internal reference pipelines and customer-facing solutions - ensuring that CIQ’s AI story is always compelling, practical, and demonstrably best-in-class. You will integrate deeply with Fuzzball, CIQ’s cloud-native computing platform, running AI workloads end-to-end through it and helping customers do the same.

Requirements

  • Deep, hands-on expertise in LLM inference optimization, including serving frameworks (vLLM, TensorRT-LLM, ONNX Runtime), quantization techniques, and GPU memory management.
  • Strong background in distributed AI training, including frameworks such as PyTorch FSDP, DeepSpeed, Megatron-LM, or JAX/XLA.
  • Proven experience building production AI pipelines and packaging AI environments for reproducible, portable deployment (containers, Apptainer/Singularity, or equivalent).
  • Fluency with GPU/accelerator profiling tools: NVIDIA Nsight, PyTorch Profiler, CUDA performance analysis, and related tooling.
  • Familiarity with HPC environments: job schedulers (Slurm, PBS), parallel filesystems, RDMA/InfiniBand, and MPI, as well as the intersection of HPC with modern AI workloads.
  • Experience integrating AI workloads into CI/CD pipelines and building automated testing and benchmarking frameworks.
  • Comfort using and building with LLM-based tools and agentic frameworks to accelerate engineering work.
  • Excellent analytical skills: able to form hypotheses, design experiments, and draw actionable conclusions from complex profiling data.
  • Strong written and verbal communication skills; able to present findings to both deeply technical audiences and business stakeholders.
  • A collaborative, humble, and always-learning mindset, combined with the confidence to champion AI engineering as a first-class concern.
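To make the benchmarking and analytical expectations above concrete, here is a minimal latency-measurement harness of the kind this role would build out. It is an illustrative sketch in pure Python, not CIQ tooling: the `benchmark` function, its parameters, and the workload under test are all assumptions; a real inference benchmark would also track throughput, GPU utilization, and memory bandwidth.

```python
import statistics
import time

def benchmark(fn, *, warmup: int = 5, iters: int = 50) -> dict:
    """Time repeated calls to `fn` and report latency percentiles.

    A hypothetical stand-in for a regression-detection harness:
    warmup iterations are discarded so cold-start effects (JIT,
    caches, allocator) do not skew the percentiles.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e3)  # milliseconds
    # statistics.quantiles with n=100 yields 99 cut points;
    # index 49 is the 50th percentile, 94 the 95th, 98 the 99th.
    qs = statistics.quantiles(samples, n=100)
    return {
        "p50_ms": qs[49],
        "p95_ms": qs[94],
        "p99_ms": qs[98],
        "mean_ms": statistics.fmean(samples),
    }

if __name__ == "__main__":
    stats = benchmark(lambda: sum(range(10_000)))
    print({k: round(v, 3) for k, v in stats.items()})
```

Tracking p95/p99 rather than only the mean is what makes regressions visible: tail latency usually degrades before average latency does.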

Nice To Haves

  • Experience working in or with open-source AI ecosystems (PyTorch, Triton, ONNX, Hugging Face, etc.) is a strong plus.
  • Background with cloud-native, containerized, and/or HPC computing environments preferred.
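As a sketch of the container-based packaging mentioned above, an Apptainer definition file for a portable AI environment might look like the following. This is illustrative only: the base image tag and the packages installed are assumptions, not a CIQ-endorsed configuration.

```
Bootstrap: docker
From: pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime

%post
    # Illustrative package set; pin versions for reproducibility in practice.
    pip install --no-cache-dir vllm transformers

%environment
    # Keep the model cache on a bind-mounted path rather than inside the image.
    export HF_HOME=/opt/hf-cache

%runscript
    exec python "$@"
```

Building once (`apptainer build ai-env.sif ai-env.def`) yields a single image file that runs identically under Slurm on an HPC cluster or in a cloud-native environment.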

Responsibilities

  • Design, implement, and tune inference pipelines for large language models and other AI workloads, targeting maximum throughput and minimum latency.
  • Apply state-of-the-art optimization techniques: quantization (INT4/INT8/FP8), model pruning, speculative decoding, continuous batching, and kernel fusion.
  • Optimize inference-serving stacks, including vLLM, TensorRT-LLM, ONNX Runtime, and similar frameworks, for production deployment on CIQ’s OS platform.
  • Profile and tune GPU/accelerator utilization across the full inference stack, from model weights and memory bandwidth to CUDA kernels and driver overhead.
  • Establish inference performance baselines and regression detection across CIQ’s AI-focused solutions.
  • Design and optimize distributed training pipelines for large-scale models, including data, model, tensor, and pipeline parallelism strategies.
  • Tune training efficiency through mixed-precision training, gradient checkpointing, activation recomputation, and optimizer-level improvements.
  • Benchmark training throughput and scaling efficiency across multi-GPU and multi-node configurations on CIQ’s infrastructure.
  • Collaborate with infrastructure and performance teams to resolve training bottlenecks at the network (RDMA/InfiniBand), storage, and OS layers.
  • Stay current on frontier model architectures and training techniques, including MoE models, RLHF pipelines, and emerging post-training methods.
  • Build and maintain a library of turnkey AI workload examples that run on CIQ’s platform, covering inference serving, fine-tuning, batch processing, RAG pipelines, and agentic workflows.
  • Develop both internal reference pipelines for CI/testing and customer-facing examples designed for immediate productivity on CIQ’s OS and Fuzzball.
  • Package workloads using containers to deliver portable, reproducible AI environments across HPC and cloud-native settings.
  • Create compelling, well-documented demos and reference architectures that communicate CIQ’s AI capabilities to technical and business audiences alike.
  • Partner with product and customer success teams to translate real-world AI use cases into reusable, production-quality examples.
  • Build and maintain AI-powered engineering tooling - leveraging LLM-based agents, automated analysis pipelines, and AI-assisted code generation to accelerate the broader engineering organization.
  • Champion an AI-first development culture: identify opportunities where AI tooling can reduce toil, surface insights faster, and improve software quality across CIQ’s products.
  • Evaluate and integrate emerging AI frameworks, libraries, and hardware as they become relevant to CIQ’s customers and product roadmap.
  • Contribute to open-source AI tooling and frameworks where relevant, reinforcing CIQ’s technical reputation in the community.
  • Develop deep expertise in CIQ’s Fuzzball platform, its architecture, scheduling model, and workload execution environment.
  • Integrate AI training, inference, and pipeline workloads into Fuzzball-based CI/CD and production pipelines.
  • Contribute to Fuzzball’s AI workload story: ensure the platform is a first-class environment for running AI workloads efficiently and at scale.
  • Help characterize and improve Fuzzball’s performance for AI-specific access patterns and resource demands.
  • Develop broad familiarity with the full CIQ product portfolio, including Rocky Linux and RLC (and its variants), Fuzzball, Apptainer, and Warewulf, and understand how AI workloads interact with each layer.
  • Collaborate closely with the Performance Engineering team to ensure AI workloads benefit from and contribute to CIQ’s systems-level optimization work.
  • Partner with product and customer success teams to translate real-world AI pain points into engineering priorities and measurable outcomes.
  • Document and communicate findings clearly, from low-level profiling data to executive-level summaries.
  • Contribute to technical publications, conference presentations, and thought leadership that reinforce CIQ’s reputation as an AI-forward infrastructure company.
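One of the techniques named above, continuous batching, can be illustrated with a small framework-agnostic simulation. This is a toy sketch of the scheduling idea used by servers such as vLLM, not their actual implementation; the class and function names are invented for illustration.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Request:
    rid: str
    tokens_needed: int   # total tokens this request will generate
    generated: int = 0

def run_continuous_batching(requests, max_batch=4):
    """Toy simulation of continuous (in-flight) batching.

    Each engine step emits one token per active request.  Requests
    that finish free their slot immediately, and queued requests are
    admitted mid-flight instead of waiting for the whole batch to
    drain -- the key difference from static batching.
    """
    waiting = deque(requests)
    active, finished, steps = [], [], 0
    while waiting or active:
        # Admit queued requests into any free slots.
        while waiting and len(active) < max_batch:
            active.append(waiting.popleft())
        # One decode step: every active request generates one token.
        for r in active:
            r.generated += 1
        # Retire finished requests immediately, freeing their slots.
        finished.extend(r.rid for r in active if r.generated >= r.tokens_needed)
        active = [r for r in active if r.generated < r.tokens_needed]
        steps += 1
    return finished, steps

if __name__ == "__main__":
    reqs = [Request("A", 2), Request("B", 5), Request("C", 3),
            Request("D", 1), Request("E", 2)]
    print(run_continuous_batching(reqs, max_batch=2))
    # → (['A', 'B', 'C', 'D', 'E'], 7)
```

For this workload, static batching (draining each batch fully before admitting the next) would need 10 steps with the same batch size of 2; the continuous scheduler finishes in 7, the minimum possible for 13 total tokens at 2 tokens per step.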

Benefits

  • Medical, dental, and vision insurance.
  • Flexible paid time off.
  • Employee stock options.