Database Engineer

Wavelo
Remote

About The Position

We are looking for a highly skilled Database Reliability Engineer (DBRE) with deep expertise in PostgreSQL at scale. In this role, you will design, operationalize, and optimize the data persistence layer that powers large-scale, mission-critical systems. You’ll work closely with SRE, Platform, and Engineering teams to ensure performance, reliability, automation, and operational excellence across the database environment. This is a hands-on engineering role focused on building resilient data infrastructure—well beyond traditional database administration. This role is a remote position open to applicants based in Canada.

Requirements

  • Deep understanding of PostgreSQL internals: MVCC, WAL processing, vacuum behavior, locking, query planning
  • Experience designing and operating highly available database clusters with automated failover
  • Strong performance tuning skills (query optimization, indexing, workload tuning)
  • Ability to diagnose database and system issues: Query plans, I/O, memory usage, WAL growth, table/index bloat
  • Experience with backup and recovery strategies: Point-in-time recovery (PITR), durability planning
  • Familiarity with observability and monitoring: Metrics, alerting, and performance dashboards (Grafana)
  • Understanding of distributed systems concepts: Service discovery, consensus (e.g., Consul)
  • Strong Linux systems knowledge (performance tuning, resource management)
  • Experience with scripting and infrastructure-as-code automation
  • Strong troubleshooting and problem-solving skills in production environments
  • Knowledge of security, compliance, encryption, auditing, and access control
  • Ability to work independently in high-availability, production-critical systems
  • Familiarity with AI-assisted tools (e.g., Claude, Windsurf, GitHub Copilot)
  • 7+ years of hands-on PostgreSQL experience in large-scale, high-volume production environments
  • Strong expertise in PostgreSQL internals: WAL, MVCC, vacuum tuning, query planner, indexing, logical replication
  • Advanced SQL and strong schema design and query optimization skills
  • Solid experience with Linux systems and networking fundamentals
  • Experience building automation using Go or Python
  • Experience with monitoring tools such as: Prometheus, Grafana, Datadog, PMM, pg_stat_statements

Nice To Haves

  • Experience with connection pooling and load balancing: PgBouncer, HAProxy
  • Experience with high-availability solutions: Patroni or similar tools
  • Exposure to event streaming and CDC: Kafka, Debezium
  • Experience supporting 24/7 production environments
  • Experience with PostgreSQL backup tools: Barman, pgBackRest, WAL-G
  • Familiarity with Traefik or similar infrastructure components

Responsibilities

  • Design, implement, and operate highly available PostgreSQL clusters (physical/logical replication, sharding, partitioning, failover automation)
  • Optimize query performance and indexing strategies
  • Perform capacity planning, growth forecasting, and workload modeling
  • Own high-availability strategies, including automatic failover, multi-region deployments, and disaster recovery
  • Build and maintain automation for provisioning and configuration, backups and recovery, failovers, vacuum tuning, and schema management
  • Use tools such as Terraform, Ansible/SaltStack, Bash, Python, etc.
  • Develop monitoring and alerting systems for PostgreSQL clusters
  • Lead response during database incidents (e.g., performance regressions, replication lag, deadlocks, bloat, storage failures)
  • Conduct root-cause analysis and implement long-term fixes
  • Partner with software engineers to review SQL queries, optimize schemas, and ensure effective use of PostgreSQL features
  • Provide guidance on database design patterns, migrations and version upgrades, and best practices