System Architect Director - AI Platform Engineering

CNA Insurance•Chicago, IL

1d•$97,000 - $189,000•Hybrid

About The Position

The System Architect (SA) Director for AI Platform Engineering serves as the technical owner of the enterprise AI platform, the shared foundation powering all AI and GenAI products across the organization. This leader owns the platform's architecture, engineering standards, and delivery roadmap, translating strategic AI goals into reliable, scalable, and governed platform capabilities that accelerate every product team building on top of them. Working in close partnership with Enterprise Architects, Product Management, and Release Train Engineers (RTEs), the SA Director ensures that platform investments are tightly aligned to business outcomes, compliance requirements, and engineering excellence. This role combines the strategic depth of a principal architect with the hands-on leadership of a delivery-focused engineering director.

Requirements

Deep AI Platform and AIOps engineering expertise, including hands-on experience designing, deploying, and operating shared AI platform capabilities such as model serving layers, LLM gateway and proxy services, prompt registries, vector databases, and automated evaluation harnesses at enterprise scale.
Proven agentic system design capability, with hands-on experience architecting multi-agent and single-agent workflow systems using orchestration frameworks such as LangGraph and Google ADK, including tool and function calling patterns, state and memory persistence strategies, and robust safe autonomy controls.
Applied GenAI depth spanning LLM solution architecture patterns, model selection and routing strategies, advanced prompt engineering techniques, fine-tuning and RLHF tradeoffs, and production-grade RAG and hybrid retrieval system design and optimization.
Strong cloud-native and distributed systems architecture skills, with deep GCP expertise across Vertex AI, Cloud Run, GKE, Pub/Sub, and BigQuery, and a solid command of API and service-based design, event-driven architecture, and high-availability and fault-tolerant system patterns.
Knowledge grounding and semantic layer proficiency, including experience building canonical ontology and entity models, designing vector search and hybrid retrieval pipelines, integrating knowledge graphs, implementing reranking strategies, and establishing citation and traceability mechanisms that support compliance.
Solid AIOps and platform reliability engineering experience, including CI/CD pipeline design for AI systems, automated evaluation and quality gates, model and dataset versioning, production monitoring and observability, reliability engineering practices, and systematic cost-performance optimization.
Practical responsible AI and security expertise, with demonstrated experience implementing enterprise AI governance frameworks, model risk management programs, PII and data privacy controls, audit and event logging, and compliance-by-design patterns suited to regulated industries.
Strong SDLC and hands-on engineering fundamentals, including Python proficiency, architectural and code review practices, comprehensive testing strategies for AI systems, technical debt management, refactoring discipline, and operational readiness standards.
Scaled Agile (SAFe) leadership experience, including decomposing long-horizon strategy into actionable Enabler Epics and shaping PI planning outcomes.
Exceptional leadership and communication skills, with a demonstrated ability to influence senior stakeholders and cross-functional teams, negotiate complex technology tradeoffs, mentor and develop engineers at all levels, and translate deep technical concepts into compelling narratives for non-technical business audiences.
Bachelor's degree in Computer Science, Software Engineering, Information Technology, or equivalent required; Master's degree in AI, Machine Learning, Data Science, or related discipline strongly preferred.
10+ years in software engineering and technical delivery, with demonstrated ownership of large-scale, distributed enterprise systems across the full SDLC from inception through production operations.
5+ years in system or solution architecture, with a track record of producing reference architectures, design patterns, technical standards, and enterprise-scale platform guardrails.
5+ years of direct people leadership, including hiring, performance management, career development, and building high-performing engineering and architecture teams.
5+ years hands-on designing, delivering, and operating AI/ML or GenAI platform capabilities in production, with measurable outcomes in quality, reliability, and developer adoption.
Strong Python proficiency and deep practical GCP experience — Vertex AI, GCP Agent Builder, and Gemini — with the ability to engage credibly in hands-on technical work alongside the engineering team.
Prior experience in regulated industries (insurance, financial services, or healthcare) strongly preferred, given stringent governance, auditability, and model risk management requirements.
Consulting or enterprise delivery background is a plus, bringing structured problem-solving and stakeholder management.

Responsibilities

Own and continuously evolve the enterprise AI Platform reference architecture, encompassing all critical layers including model serving, orchestration engines, data and knowledge grounding pipelines, and observability infrastructure, and ensure the platform scales reliably to enterprise-grade workloads and usage patterns.
Define and enforce platform-wide standards, reusable design patterns, and golden-path templates that enable product and feature teams to build, deploy, and operate AI solutions safely, consistently, and with significantly reduced time-to-production.
Drive end-to-end delivery of new platform capabilities, from initial technical discovery and architecture design through prototyping, hardening, and full production rollout, while maintaining meaningful hands-on involvement at critical technical milestones to ensure quality and coherence.
Architect and operationalize the core platform service catalog, including LLM gateway and routing layers, prompt lifecycle management, agentic orchestration frameworks, Retrieval-Augmented Generation (RAG) pipelines, vector stores, model registries, and rigorous automated evaluation infrastructure.
Build and maintain robust CI/CD and AIOps pipelines specifically designed for AI systems, incorporating automated evaluation gates, model and data versioning controls, staged deployment promotion, and continuous cost and performance optimization guardrails.
Architect enterprise-grade multi-agent and single-agent workflow patterns for high-value business use cases, establishing clear standards for orchestration design, state and memory management, tool and API integration, and safe autonomy controls including human-in-the-loop approvals, permission scoping, and comprehensive audit trails.
Design and implement knowledge grounding systems — spanning hybrid retrieval strategies, semantic reranking, ontology-driven entity modeling, and knowledge graph integration — to measurably improve AI output accuracy, traceability, and readiness for regulatory audit.
Embed responsible AI and compliance-by-design principles into every layer of the platform, covering data privacy protections, enterprise secrets management, granular access controls, output leakage prevention, and model risk governance practices aligned to enterprise and regulatory standards.
Actively shape PI Planning by authoring well-defined Enabler Epics and articulating architectural outcomes that anchor near-term delivery and long-horizon platform capability roadmaps, while contributing expert WSJF input to balance platform investment against feature team needs, risk reduction, and time-to-impact.
Directly manage, mentor, and grow a high-performing team of platform engineers, solution architects, and technical specialists — hiring hands-on builders, coaching technical leadership skills, and sustaining a healthy innovation pipeline that continuously advances the organization's AI platform maturity.
May perform additional duties as assigned.

Benefits

Comprehensive and competitive benefits package to help our employees – and their family members – achieve their physical, financial, emotional and social wellbeing goals.
