AI Knowledge Data Engineer

iBusiness Funding · Fort Lauderdale, FL
17d

About The Position

We are seeking an experienced AI Knowledge Data Engineer to design, implement, and scale state-of-the-art AI systems that combine large language models (LLMs), advanced retrieval techniques, cognitive memory architectures, knowledge representation, and data fusion. In this role, you will orchestrate robust data pipelines, architect scalable training data solutions, and build the foundational knowledge bases that power next-generation AI agents. You will collaborate with cross-functional teams to ensure our systems efficiently retrieve, contextualize, and generate accurate information for diverse business applications.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field.
  • Proven experience designing and scaling data pipelines and training data workflows for LLMs or similar AI systems.
  • Strong background in information retrieval systems, RAG frameworks, and vector search technologies (e.g., FAISS, Pinecone, Elasticsearch, Milvus).
  • Proficiency in programming (Python) and machine learning libraries (TensorFlow, PyTorch).
  • Experience with ontologies, knowledge graphs, and semantic technologies (RDF, OWL, SPARQL).
  • Familiarity with distributed data processing and orchestration tools (e.g., Spark, Airflow, Kubeflow).
  • Excellent analytical, problem-solving, and communication skills.
  • Ability to work collaboratively in a cross-functional, fast-paced environment.

Nice To Haves

  • Experience with LLM fine-tuning, prompt engineering, and RAG optimization.
  • Familiarity with data-centric AI principles and training data quality assessment.
  • Experience with cloud platforms and scalable storage solutions.
  • Background in cognitive memory architectures or AI agent design.

Responsibilities

  • Architect, implement, and optimize retrieval-augmented generation (RAG) workflows by integrating local LLMs (e.g., Llama) with retrieval mechanisms (vector search, Elasticsearch, FAISS, Weaviate).
  • Design, build, and maintain scalable data pipelines for ingesting, transforming, indexing, and retrieving structured and unstructured data from diverse sources.
  • Design, build, and scale addressable services and tool specifications that LLMs and agents can use to orchestrate workflows.
  • Orchestrate and scale training data operations, including data curation, versioning, and lineage tracking for large-scale LLM training and fine-tuning.
  • Develop and maintain ontologies, knowledge graphs, and semantic data models to structure and integrate domain-specific knowledge for improved retrieval and reasoning.
  • Implement and optimize knowledge retrieval strategies (dense/sparse retrieval, ranking algorithms) to maximize system accuracy and relevance.
  • Fuse disparate knowledge bases and heterogeneous data sources into a unified layer that provides access to relevant contextual information.
  • Design cognitive memory systems for AI agents, enabling persistent knowledge retention and contextual awareness across interactions.
  • Collaborate with AI researchers, data scientists, and engineers to align knowledge architecture with business objectives and ensure data quality.
  • Evaluate and integrate new technologies and research advancements in LLMs, RAG, information retrieval, and knowledge representation.
  • Maintain clear and comprehensive documentation of models, pipelines, and workflows.

Benefits

  • Medical, dental, and vision coverage
  • 401(k) with company match
  • Paid time off