About the job

Sifflet connects to many different data sources: data warehouses (Google BigQuery, Snowflake, AWS Redshift…), business intelligence visualisation solutions (Looker, Power BI, AWS QuickSight…), transformation/ETL tools (dbt, Fivetran, Airflow…)… For each of these data sources, we need to support all Sifflet features (catalog, lineage, monitoring…). As each integration requires deep knowledge of the API, data model, and behaviour of the data source, Sifflet has a team dedicated to building these integrations.

As a member of this team, you will:
- Design and implement new integrations with data products. This often requires using each data source and researching how it behaves, and then thinking hard about how to model it within the Sifflet platform.
- Make the necessary changes to architecture and implementation to scale our data ingestion engine: some of our customers connect Sifflet to very large instances.
- Add support for completely new integration types, which entails defining how they will be displayed and integrated within the Sifflet application.
- Lead technical improvements to our codebase and architecture: integrating with many external services naturally raises challenges around modularization, testability, and reliability.
- Help improve the team's standards and processes, both around technical decisions and product design.

Some projects you could be working on
- Build a new model for our lineage capabilities, seamlessly merging data from various sources (such as query logs processed by our in-house SQL parser, data warehouse lineage APIs, or dbt models) into an easy-to-query model used as a source for both automated capabilities (such as root cause analysis) and UI elements (lineage graph).
- Optimize the queries issued by our ingestion engine to reduce the cost incurred by customers when monitoring their data sources with Sifflet.
- Fetch query history from all sources, and use it as an input for automated root cause analysis.

Our stack
- Applications written in (modern) Java, to tap into the huge data ecosystem offered by this language; Spring Boot 3.
- Other teams at Sifflet use TypeScript + Vue.js (frontend) or Python. You may need to write small chunks of code in these languages too.
- Infrastructure: Kubernetes (AWS EKS clusters), MySQL (on AWS RDS), Temporal for job orchestration.
- Plus a few supporting services: GitLab CI, Prometheus/Loki/Grafana, Sentry…

While not directly part of our stack, expect to gain a lot of knowledge about many products in the modern data ecosystem. The subtleties of BigQuery or Snowflake will soon be very familiar to you.

Job Type
Full-time
Career Level
Mid Level
Education Level
No Education Listed
Number of Employees
1-10 employees