Stefanini Group is hiring! Stefanini is looking for a Backend AI Engineer - Hybrid. For quick apply, please contact Akash Gupta; Ph: 248 728 2603, [email protected]. W2 Only!

Key Responsibilities:
- Design, develop, and maintain robust backend services and APIs to support AI/ML applications across the organization.
- Actively participate in Agile rituals and follow Scaled Agile processes as set forth by the CDP Program team.
- Deliver high-quality backend services following SAFe Agile practices.
- Proactively identify and resolve issues with AI services, APIs, and model serving infrastructure.
- Deploy comprehensive monitoring and alerting for backend AI systems, implementing auto-remediation where possible to ensure system availability and reliability.
- Employ a security-first, testing, and automation strategy, adhering to backend engineering and MLOps best practices.
- Collaborate with cross-functional teams, including data scientists, front-end engineers, data engineers, and business stakeholders, to understand requirements and deliver robust backend solutions.
- Keep up with the latest trends and technologies, evaluating and recommending new tools, frameworks, and architectures to improve backend AI capabilities.

What You'll Bring:

Backend API Development (50%):
- Design and develop robust, scalable RESTful APIs and GraphQL services for AI/ML applications.
- Build backend services for LLM-powered applications, including RAG systems, document processing pipelines, and knowledge bases.
- Implement secure API endpoints for model inference, prompt orchestration, and AI service integration.
- Develop asynchronous processing workflows for long-running AI tasks (document analysis, batch predictions).
- Create middleware and service layers to abstract complex AI functionality for front-end consumption.
- Design and implement caching strategies, rate limiting, and API versioning for production AI services.
- Build event-driven architectures using message queues (SQS, SNS, EventBridge) for scalable AI workflows.
- Ensure APIs meet security, authentication, and authorization requirements for Federal environments.
- Implement comprehensive error handling, logging, and observability for AI services.

MLOps & Infrastructure (30%):
- Build and maintain ML model serving infrastructure using AWS SageMaker, Lambda, and containerized deployments.
- Integrate AWS AI services (Bedrock, Textract, Comprehend) into backend pipelines and APIs.
- Develop CI/CD pipelines for automated testing, deployment, and rollback of AI services.
- Implement model versioning, A/B testing frameworks, and canary deployment patterns.
- Create data preprocessing and feature engineering pipelines using PySpark and Databricks.
- Build orchestration workflows for multi-step AI processes (ingestion → preprocessing → inference → post-processing).
- Develop monitoring and alerting systems for model performance, latency, cost, and availability.
- Collaborate with data scientists to productionize ML models, transitioning them from prototype to production.
- Implement vector databases and semantic search capabilities for RAG architectures.
- Manage prompt templates, model configurations, and AI service parameters as code.

Performance & Support (20%):
- Optimize backend performance for high-throughput AI workloads and real-time inference.
- Monitor and troubleshoot production issues, including latency, errors, and cost optimization.
- Implement automated remediation and self-healing capabilities for backend services.
- Conduct performance testing and capacity planning for AI infrastructure.
- Provide technical support and act as the escalation point for backend AI service issues.
- Create comprehensive technical documentation for APIs, architecture patterns, and deployment procedures.
- Stay current on backend technologies, AI infrastructure trends, and Federal regulatory requirements.
- Collaborate with security teams to ensure compliance with data protection and privacy regulations.

#LI-AG #LI-HYBRID

Minimum Qualifications:
- Education: Bachelor's degree in Computer Science, Software Engineering, Information Systems, or a related technical field, or equivalent experience
- Experience: 4+ years in backend development, with at least 2+ years building and deploying AI/ML or LLM-powered services
- Programming: Strong Python proficiency; experience with frameworks like FastAPI, Flask, or Django
- API Design: Proven experience designing and implementing RESTful APIs, with an understanding of API security and best practices
- LLM Integration: Hands-on experience building backend services for LLM applications, including prompt orchestration, RAG architectures, and AI service integration
- Cloud Platforms: Working knowledge of AWS services including Lambda, API Gateway, SageMaker, Bedrock, S3, and related AI/ML tools
- Database Experience: Proficiency with both relational (PostgreSQL, MySQL) and NoSQL databases (DynamoDB, MongoDB); experience with vector databases (Pinecone, Weaviate, pgvector) preferred
- Containerization: Experience with Docker and container orchestration (ECS, EKS, or similar)
- ML Model Deployment: Demonstrated ability to deploy and serve ML models in production environments
- Asynchronous Programming: Experience with async/await patterns, message queues, and event-driven architectures
- Testing: Strong understanding of unit testing, integration testing, and test automation practices
- Communication: Ability to collaborate effectively with cross-functional teams and translate business requirements into technical solutions

Preferred Qualifications:
- 3+ years' experience with PySpark and distributed computing frameworks
- Experience with Databricks, Collibra, and Starburst
- Knowledge of MLOps tools and frameworks (MLflow, Kubeflow, SageMaker Pipelines)
- Experience with Infrastructure as Code (Terraform, CloudFormation)
- Familiarity with streaming data platforms (Kafka, Kinesis)
- Experience building end-to-end data pipelines and ETL processes
- Understanding of microservices architecture and service mesh patterns
- Experience with observability tools (DataDog, New Relic, CloudWatch)
- Background working in regulated industries (financial services, healthcare, government)
- Knowledge of data governance, lineage, and compliance frameworks
- Experience with GraphQL API design and implementation
- Familiarity with econometric models and statistical computing environments (R, Stata)

Stefanini takes pride in hiring top talent and developing relationships with our future employees. Our talent acquisition teams will never make an offer of employment without having a phone conversation with you. That conversation will cover the job for which you have applied as well as the hiring process, including interviews and job offers.

About Stefanini Group: The Stefanini Group is a global provider of offshore, onshore, and nearshore outsourcing, IT digital consulting, systems integration, application, and strategic staffing services to Fortune 1000 enterprises around the world. Our presence spans the Americas, Europe, Africa, and Asia, serving more than four hundred clients across a broad spectrum of markets, including financial services, manufacturing, telecommunications, chemical services, technology, public sector, and utilities. Stefanini is a CMM Level 5 IT consulting company with a global presence.
Job Type: Full-time
Career Level: Entry Level
Number of Employees: 5,001-10,000 employees