We are seeking a Senior Machine Learning Engineer / Platform Engineer to design and build a production-grade agentic workflow platform. This role sits at the intersection of LLM systems engineering, distributed platforms, and applied ML, with a strong emphasis on orchestration, reliability, and extensibility.

You will be responsible for architecting and implementing agent-based workflows that integrate large language models, retrieval systems, structured knowledge, and external APIs, designed for robustness, observability, and real-world business use.

Design and implement multi-agent and single-agent workflows using orchestration patterns and tools, context engineering, memory management, and guardrail strategies.
Design RAG pipelines incorporating vector search, hybrid retrieval, and citation tracking.
Implement knowledge graph–backed reasoning, including ontologies, entity resolution, and graph-based context construction.
Design evaluation frameworks for agent task completion correctness, quality, cost, and latency.
Develop and deploy machine learning models, focusing on production readiness, scalability, and performance.
Collaborate with data scientists to transition experimental models into robust, production-grade applications.
Integrate with collaboration platforms (e.g., Teams, alerting systems) for intelligent distribution of insights.
Implement and manage CI/CD pipelines to automate deployment, testing, and monitoring of models.
Architect and deploy systems on AWS, leveraging compute, storage, and security services.
Job Type
Full-time
Career Level
Mid Level
Number of Employees
1,001-5,000 employees