About The Position

We are seeking a Senior Machine Learning Engineer / Platform Engineer to design and build a production-grade agentic workflow platform. This role sits at the intersection of LLM systems engineering, distributed platforms, and applied ML, with a strong emphasis on orchestration, reliability, and extensibility. You will architect and implement agent-based workflows that integrate large language models, retrieval systems, structured knowledge, and external APIs, all designed for robustness, observability, and real-world business use.

Requirements

  • Bachelor’s or master’s degree in Computer Science, Engineering, or a related field.
  • 6+ years of experience in software engineering, ML engineering, or platform engineering.
  • Strong proficiency in writing production-grade Python, and experience with Claude Code or Cursor.
  • Hands-on experience with LLM-based systems, including LangChain / LangGraph, MCP, LangSmith, Claude or comparable frontier models, and AWS AgentCore or comparable agentic frameworks.
  • Solid understanding of RAG architectures, embeddings, and vector search.
  • Experience designing and consuming APIs (REST and/or async/event-driven).
  • Strong cloud engineering experience on AWS.
  • Knowledge of how to fine-tune frontier models on domain-specific knowledge.
  • Experience deploying traditional machine learning models into production environments using MLOps tools and best practices.
  • Knowledge of distributed systems, large-scale model optimization, and API development.
  • Exceptional ability to work on a team, especially a dynamic, innovative “tiger team” developing early-stage PoC systems.
  • Strong understanding of container orchestration and cloud-native application design.
  • Ability to work in dynamic environments, handling rapid experimentation and iterative development.

Nice To Haves

  • Experience with distillation, quantization, and small language models

Personal Characteristics

  • A self-motivated individual who thrives on seeing the results of their work and its impact on the business
  • Strong communication skills, both verbally and in writing
  • A keen sense for the art of the possible
  • Proven ability to be flexible and work hard, both independently and collaboratively
  • Methodical and organized - in general, in experimental design, and in code!
  • Attention to detail with strong analytical, mathematical, and problem-solving skills
  • An interest in learning about the energy commodities space
  • Resourceful and able to think creatively and adapt in a dynamic and energetic environment
  • Team player, with an open, non-political style and a high level of personal integrity
  • Desire to be a thought-partner in a fast-growing team, and make an impact at a business that sits at the heart of the world’s energy flows

Responsibilities

  • Design and implement multi-agent and single-agent workflows using orchestration patterns and tools, context engineering, memory management, and guardrail strategies.
  • Design RAG pipelines incorporating vector search, hybrid retrieval, and citation tracking.
  • Implement knowledge graph–backed reasoning, including ontologies, entity resolution and graph-based context construction.
  • Design evaluation frameworks covering agent task-completion correctness, quality, cost, and latency.
  • Develop and deploy machine learning models, focusing on production readiness, scalability, and performance.
  • Collaborate with data scientists to transition experimental models into robust, production-grade applications.
  • Integrate with collaboration platforms (e.g., Teams, alerting systems) for intelligent distribution of insights.
  • Implement and manage CI/CD pipelines to automate deployment, testing, and monitoring of models.
  • Architect and deploy systems on AWS, leveraging compute, storage, and security services.