Software Engineer - AI Platform Engineer | Onsite | Addison, TX / Charlotte

Photon

About The Position

We are looking for an AI Platform Engineer: a builder who can architect the "factory" where AI is made. Our goal is to build an internal, on-premises AI ecosystem that mimics the capabilities of AWS or Azure. You will be responsible for creating a horizontal platform that multiple lines of business use to deploy AI projects simultaneously.

Requirements

Expert Python: Deep, hands-on knowledge is mandatory.
Data Engineering: Extensive experience in massive data ingestion and processing.
RAG Expertise: Deep understanding of vector databases, inferencing, and advanced chunking strategies.
Platform Engineering: Proven experience building tools/platforms that other developers or business units use.
Infrastructure Knowledge: Experience replicating cloud capabilities (AWS/Azure) within a strictly on-premises environment.
DevOps: Familiarity with Jenkins, Ansible, and automated deployment pipelines.
Seniority: This is a senior-level role. We are looking for someone with a proven track record (10-15+ years) of building production-grade platforms.
Industry Knowledge: You must stay current with the "latest and greatest" in AI (e.g., RAG-less inferencing, agentic frameworks).
Problem Solver: Must be able to take a use case from a business unit and translate it into a scalable platform service.

Nice To Haves

Experience with Scale: Hands-on work with large-scale GPU farms and high-volume data environments is highly preferred.

Responsibilities

Platform Architecture: Design and develop a "Model-as-a-Service" platform that allows non-experts to use drag-and-drop components to build AI solutions.
RAG-as-a-Service: Build and optimize end-to-end Retrieval-Augmented Generation (RAG) pipelines, including sophisticated chunking strategies and vector database management.
Tooling & Libraries: Develop and maintain MCP (Model Context Protocol) libraries, clients, and servers to connect various data sources to the AI engine.
Infrastructure Management: Help manage and optimize one of the largest on-premises GPU farms in the U.S. banking sector (500+ NVIDIA nodes).
Agentic AI: Build a repository for Agentic AI where users can select existing agents or build custom ones for specialized tasks.
CI/CD Integration: Integrate AI deployment pipelines with enterprise-level CI/CD tools like Jenkins and Ansible.
Compliance & Guardrails: Implement corporate-level guardrails and work within Model Risk Management (MRM) frameworks to ensure all AI deployments are secure and compliant.

Benefits

Medical, vision, and dental benefits, a 401(k) retirement plan, variable pay/incentives, paid time off, and paid holidays are available for full-time employees.
