Adobe • Posted 5 days ago
Full-time • Mid Level
Austin, TX
5,001-10,000 employees

Our Company

Changing the world through digital experiences is what Adobe’s all about. We give everyone, from emerging artists to global brands, everything they need to design and deliver exceptional digital experiences! We’re passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen. We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity. We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours!

The Opportunity

We are seeking a passionate Data Engineer who can operate at the intersection of data processing and machine learning to derive intelligence from enterprise content and help build the data backbone of brand-aware generative AI. In this role, you will design and implement data foundations that ingest and enrich structured, unstructured, and multimodal brand assets, creating a scalable and retrievable brand intelligence platform. Your work will directly help enterprises create content at scale while preserving brand identity. As part of the brand AI services team, you’ll ensure generative models reliably access timely, relevant data, driving better reasoning, personalization, and consistency across key Adobe surfaces (GenStudio for Performance Marketers, Workfront, AEM).

Responsibilities

  • Build scalable ingestion pipelines for brand information and creative assets, ensuring freshness, reliability, and versioning.
  • Integrate and leverage systems and models from our machine learning and data science partner teams.
  • Design and maintain brand-aware data models, ontologies, and multi-modal graphs to support context linking and rich retrievals.
  • Implement hybrid storage and retrieval strategies across vector databases, graph databases, and search engines, optimizing for precision and latency.
  • Develop metadata enrichment pipelines to enhance semantic search, personalization, and optimization for RAG-based conversational systems.
  • Ensure data quality and observability by monitoring metrics for accuracy, coverage, and timeliness; build monitoring systems to track ingestion and retrieval health.
  • Collaborate with product, ML, and information retrieval teams to align data infrastructure with creative workflows.
  • Optimize pipelines using distributed/streaming systems for scale and speed.

Qualifications

  • 4+ years of software development experience with 1+ year in data engineering, search relevance, or large-scale systems for conversational experiences.
  • Expertise in building reliable and innovative ETL pipelines for heterogeneous data.
  • Experience with distributed data frameworks (Spark, Flink) and streaming platforms (Kafka).
  • Proficiency in Python (preferred) or Java/Scala, with strong CS fundamentals.
  • Experience with cloud platforms (Azure, AWS, or GCP) and containerization/orchestration (Docker, Kubernetes).
  • Familiarity with modern search systems (Elastic, Vespa) and graph databases (Neo4j, TigerGraph).
  • Understanding of ML data pipelines and MLOps standards for monitoring and continuous improvement.
  • Self-starter who thrives in zero-to-one environments and can make informed tradeoffs.
  • Background in information retrieval, NLP, or cognitive computing.
  • Experience developing, optimizing, and deploying data processing with Apache Spark/Dask/Ray.
  • Experience with ontologies, knowledge graphs, or semantic enrichment pipelines.
  • Degree in Computer Science, Information Systems, or related field.