About The Position

We’re looking for a Data Platform Reliability Engineer II to own reliability, deployments, and cost efficiency for a domain within Chewy’s Enterprise Data Systems (EDS). You’ll lead automation and adoption of the paved road, ensuring consistency, transparency, and scalability across Snowflake, dbt Cloud, AWS, and BI.

Requirements

  • BA/BS in Computer Science, Engineering, Mathematics, or a related field.
  • 3–5 years of experience in data engineering, DevOps, or platform reliability.
  • Strong SQL and Python scripting skills.
  • Hands-on experience with Snowflake, including credit monitoring, warehouse optimization, and performance tuning.
  • Experience with dbt Cloud, AWS, and Terraform for infrastructure provisioning.
  • Familiarity with CI/CD tools and environment promotion workflows.
  • Excellent analytical and problem-solving skills; strong ownership mentality.
  • Enthusiasm for automation, AI observability, and continuous improvement.

Responsibilities

  • Own domain-level reliability: run on-call, lead sev-2/3 incidents, coordinate stakeholder communications, drive RCAs, and automate the top recovery actions to reduce MTTR.
  • Define and track domain SLOs for freshness, completeness, and accuracy; meet SLAs through tests, monitors, and dashboards.
  • Manage dbt/Snowflake deployments and environment promotion pipelines; implement pre-deploy checks, canary releases, automated rollback, and change failure rate tracking.
  • Lead Snowflake cost initiatives: credit allocation, budget tracking, optimization playbooks, and transparency reporting.
  • Author and maintain Terraform modules for domain infrastructure; ship changes via CI/CD with plans, reviews, and rollback paths.
  • Contribute to paved-road onboarding materials and guardrails; help new teams land with standard configurations and observability defaults.
  • Build AI-assisted observability views for anomaly detection, drift, and warehouse optimization.
  • Embed catalog and lineage coverage checks in deployments; enforce coverage thresholds while data stewards own certification and metric definitions.
  • Improve runbooks, reduce operational toil, and mentor Level I engineers on best practices.