Senior Database Engineer - Infra

Solidus Labs•New York, NY

12h•Remote

About The Position

Solidus Labs provides cutting-edge trade surveillance technology for financial markets, including traditional assets, prediction markets, and crypto markets. Built on over 20 years of FinTech experience, its solutions are used by financial institutions and regulators worldwide to detect market manipulation, financial crime, and fraud. Solidus Labs is headquartered on Wall Street with offices in Singapore, Tel Aviv, and London, and monitors over a trillion events daily for millions of entities globally. The role is for a Software Engineer with Data Engineering experience who is proficient in building scalable, maintainable, and monitored data pipelines in cloud environments. As an ambitious start-up, Solidus Labs values independence, accountability, organization, a self-starter attitude, and a willingness to go beyond official scope while maintaining focus on goals and the big picture.

Requirements

BSc in Computer Science.
Strong background as a software engineer with at least 5 years of hands-on experience with Java, Rust, or Python.
8+ years in data engineering and data pipeline development in high-volume, low-latency production environments.
Experience working in low-latency, real-time systems processing billions of events a day.
Deep, hands-on ClickHouse expertise, including cluster architecture, table engine selection, replication, sharding, and query optimization.
Proficiency across the broader data engineering stack: Apache Kafka, Spark, Airflow, Kubernetes, Redis, Snowflake, and caching technologies.
Expert-level SQL and query optimization skills, with a strong emphasis on ClickHouse-specific patterns (materialized views, projections, TTLs, and MergeTree tuning).
Experience with monitoring and observability tooling (Prometheus, Grafana, or similar), with the ability to define and own operational health metrics for a ClickHouse deployment.
Curiosity, ability to work independently, and a track record of proactively identifying and driving solutions.
Excellent verbal and written communication skills, including the ability to coach and influence engineers across teams in a remote environment.

Nice To Haves

Experience engaging with the ClickHouse vendor team or community is a strong plus.

Responsibilities

Design and optimize the ClickHouse data layer, including table engines, partition strategies, materialized views, and storage policies, for high performance at billions-of-events scale.
Own ClickHouse cluster sizing, topology decisions, and capacity planning for both real-time ingestion and T+1 batch workloads, balancing cost, latency, and throughput.
Drive data reliability and deduplication strategies within ClickHouse using engine-level features and pipeline-level controls to ensure data completeness and consistency.
Establish and continuously improve monitoring, alerting, and observability for the ClickHouse layer, covering replication health, merge performance, query latency, and resource utilization.
Serve as the internal ClickHouse authority, coaching engineering teams on query optimization, data modeling best practices, and efficient use of ClickHouse constructs.
Act as the primary liaison with the ClickHouse vendor team for triaging issues, incorporating feedback, evaluating new features, and translating guidance into actionable improvements.
Collaborate with downstream consumers (analytics, ML, product) to understand access patterns and refine data storage and serving for improved query performance, schema design, and data formats.
Define and enforce schema versioning and governance standards within the ClickHouse environment to ensure schema evolution does not compromise pipeline reliability or consumer compatibility.
