Ampcus • posted about 2 months ago
Full-time • Mid Level
Hybrid • Reston, VA
1,001-5,000 employees

We are seeking a highly skilled and motivated Data Engineer to join our growing AI/ML team. This role is ideal for someone passionate about building scalable data pipelines, enabling machine learning workflows, and integrating cutting-edge Large Language Models (LLMs) into production systems. You will work closely with data scientists, ML engineers, and software developers to design and implement robust data infrastructure that powers intelligent applications.

Responsibilities:
  • Design, build, and maintain scalable and efficient ETL/ELT pipelines using Python and modern data engineering tools.
  • Collaborate with AI/ML teams to support model training, evaluation, and deployment workflows.
  • Develop and optimize data schemas, storage solutions, and APIs for structured and unstructured data.
  • Integrate and fine-tune LLMs (e.g., OpenAI, Hugging Face Transformers) for various business use cases.
  • Ensure data quality, governance, and compliance across all data systems.
  • Monitor and troubleshoot data workflows and model performance in production.
  • Automate data ingestion from diverse sources including APIs, databases, and cloud storage.
  • Contribute to the development of internal tools and libraries for ML experimentation and deployment.
Qualifications:
  • Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
  • 3 years of experience in data engineering or backend development.
  • Strong proficiency in Python and libraries such as Pandas, NumPy, and PySpark.
  • Experience with AI/ML frameworks (e.g., TensorFlow, PyTorch, Scikit-learn).
  • Hands-on experience with LLMs and NLP tools (e.g., LangChain, Hugging Face, OpenAI API).
  • Proficiency in SQL and working with relational and NoSQL databases.
  • Familiarity with cloud platforms (AWS, GCP, Azure) and containerization (Docker, Kubernetes).
  • Knowledge of CI/CD pipelines and version control (Git).
  • Experience with MLOps tools (MLflow, Airflow, Kubeflow).
  • Understanding of data privacy and security best practices.
  • Exposure to vector databases (e.g., Pinecone, FAISS, Weaviate).
  • Experience with real-time data processing (Kafka, Flink).