Software Engineer, Applied AI

Airbyte•San Francisco, CA

10d•Onsite

About The Position

Airbyte is the open-source standard for data movement. We've enabled data teams to move data from applications, APIs, unstructured sources, and databases to data warehouses, lakes, and AI applications. With tens of thousands of connectors built and hundreds of thousands of companies adopting Airbyte, we've proven the economics of data integration at scale.

Now Airbyte is building frontier agentic data infrastructure, purpose-built for AI agents that need fast, accurate access to data across hundreds of sources. Our mission: make data available and actionable, everywhere.

We've raised $181M from the world's top investors (Benchmark, Accel, Altimeter, Coatue, Y Combinator, etc.), and we believe in product-led growth, where we build something awesome that all our users love. We've raised enough capital to explore boldly, but we still choose to move quickly, stay scrappy, and experiment constantly as we find the right paths in an AI-native landscape.

The Role: As a Software Engineer on our Data Replication team, you will design and build intelligent systems that dramatically improve how data moves through Airbyte, from first deployment and initial sync to ongoing execution at scale. You'll leverage LLM-based tools, agentic workflows, and automation to accelerate connector rollout, improve sync reliability, reduce total cost of ownership (TCO), and make the data movement experience seamless for both OSS and Cloud users.

This role sits at the intersection of AI systems, distributed data platforms, and developer experience. Your work will directly impact sync performance, operational excellence, and how quickly Airbyte can ship improvements across its control plane, data plane, and connector ecosystem.

Requirements

5+ years of engineering experience (backend, platform, or distributed systems) with strong proficiency in Python and/or Kotlin
Hands-on experience building or operating data pipelines, replication systems, or ETL/ELT platforms
Experience designing systems that integrate LLMs with structured data, logs, APIs, or retrieval systems
Familiarity with agentic or orchestration frameworks (e.g., LangChain, Pydantic AI, Temporal-style workflows)
Experience deploying and monitoring production systems, including LLMOps, observability, and alerting
Experience running services on Kubernetes and major cloud providers, using Helm and Terraform
Strong understanding of APIs, databases, connectors, schemas, and telemetry in distributed environments
Systems-level thinking with an emphasis on performance, reliability, cost, and scalability
A startup-ready mindset: comfortable with ambiguity, moving fast, and owning problems end-to-end
A builder’s instinct for automation, leverage, and developer experience

Nice To Haves

Experience with open-source platforms, especially in data integration or infrastructure tooling
Familiarity with Airbyte, CDKs, or connector-based architectures
Exposure to large-scale connector fleets, schema evolution, CDC, or long-running sync execution
Background in control plane/data plane architectures or internal developer platforms

Responsibilities

Build AI-driven systems for data replication and connector lifecycle management, accelerating connector development, rollout, testing, and upgrades across OSS, Enterprise, and Cloud
Design and implement agentic workflows that assist with diagnosing sync failures, schema evolution issues, performance regressions, and rollout risks across large fleets of connectors
Use AI to build connectors and frameworks that support a wide range of reliable integrations at scale
Develop observability, anomaly detection, and automated remediation systems (ML + LLM hybrid) for data sync execution, job correctness, and CDC pipelines
Improve control plane and data plane operations by automating deployment validation, release qualification, and environment testing (AWS, GCP, local, KIND)
Own AI systems across the full lifecycle: design, prompt engineering, evaluation, deployment, monitoring, and iteration in production (LLMOps)
Partner closely with platform, infra, and product teams to embed AI-powered capabilities into Airbyte’s deployment flows, APIs, and Cloud self-serve experience
Build high-leverage internal tooling that helps Airbyte ship connector and CDK changes faster while maintaining correctness, performance, and cost efficiency
