Senior Database Engineer - Platform Engineering

IntegriChain
Philadelphia, PA
Hybrid

About The Position

Join our DevOps Engineering team as a Senior Database Engineer to design and build cloud-native database platforms across a modern, multi-engine data stack. This is an engineering role, not a DBA role: the focus is on building scalable systems, writing infrastructure as code, and embedding databases into software delivery pipelines. You'll work closely with DevOps and Product Engineering to build high-performing data infrastructure for critical applications and analytics. You will own and evolve a diverse ecosystem spanning AWS RDS, Aurora, DynamoDB, Redshift, Azure SQL, PostgreSQL, Snowflake, and NoSQL engines, integrating AI-driven automation and MLOps-ready data foundations that support both applications and machine learning workflows.

Requirements

  • 7+ years of experience in database platform engineering, data engineering, or cloud infrastructure engineering in production environments.
  • Proven experience as a lead or senior engineer on multi-engine database platforms spanning both SQL and NoSQL workloads — with a software engineering, not administration, mindset.
  • Strong track record of designing and operating data platforms at scale in AWS environments, with databases managed as code from day one.
  • Deep hands-on expertise with AWS RDS (PostgreSQL, MySQL, Oracle), Aurora (Serverless v2, Global Database), and RDS Proxy.
  • Production experience with DynamoDB: single-table design, GSI/LSI strategy, Streams, DAX, and capacity planning (see the single-table sketch after this list).
  • Working knowledge of AWS Redshift, Glue, Lake Formation, Kinesis, MSK, and EventBridge for pipeline and lakehouse architectures.
  • Strong hands-on Snowflake experience: performance tuning (clustering, materialized views, query profiling), cost optimization (warehouse sizing, auto-suspend, credits), security (RBAC, dynamic masking, network policies), and data sharing.
  • Deep SQL expertise across multiple engines (PostgreSQL, T-SQL, Snowflake SQL, DynamoDB PartiQL).
  • Strong understanding of Medallion Architecture, semantic layers, and analytics engineering best practices.
  • Proven NoSQL data modeling: DynamoDB single-table design, document store schema design, and search index architecture.
  • Experience building and operating advanced ELT/ETL pipelines using dbt, AWS Glue, Airflow, or similar orchestration frameworks.
  • Hands-on experience with streaming ingestion using Kinesis, MSK (Kafka), or equivalent event-driven technologies.
  • Familiarity with CDC patterns and tools (DMS, Debezium) for cross-engine data synchronization.
  • Understanding of ML pipeline requirements: feature engineering, training dataset preparation, model versioning, and inference data patterns.
  • Exposure to AWS SageMaker, Bedrock, or equivalent ML platforms from a data infrastructure perspective.
  • Awareness of vector databases and embedding-based retrieval (pgvector, OpenSearch k-NN) is a strong plus.
  • Proficiency with Terraform for database and cloud infrastructure as code; AWS CDK experience is a plus (see the CDK sketch after this list).
  • Proficiency with Python (boto3, SQLAlchemy, pandas) and SQL for data transformation, automation, and tooling.
  • Experience integrating database workflows into CI/CD pipelines using GitHub Actions, CodePipeline, or similar.
  • Candidates must reside in Pennsylvania, New Jersey, or New York and be within a reasonable travel distance of our Philadelphia office, as regular in-person collaboration is required.
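
To make the DynamoDB requirement concrete, here is a minimal single-table sketch in Python with boto3. The table name, key layout, and GSI attributes are illustrative assumptions, not an existing schema; the point is that multiple entity types share one table, distinguished by composite keys.

    import boto3
    from boto3.dynamodb.conditions import Key

    # Hypothetical single-table layout: PK = "CUSTOMER#<id>", SK = "ORDER#<ts>".
    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("app-platform")  # illustrative table name

    def put_order(customer_id: str, order_ts: str, total: str) -> None:
        # Entity type is encoded in the keys, not in separate tables.
        table.put_item(Item={
            "PK": f"CUSTOMER#{customer_id}",
            "SK": f"ORDER#{order_ts}",
            "GSI1PK": "ORDER_STATUS#OPEN",  # hypothetical GSI for status queries
            "GSI1SK": order_ts,
            "total": total,
        })

    def orders_for_customer(customer_id: str):
        # Key-only query: all of one customer's orders, newest first.
        resp = table.query(
            KeyConditionExpression=Key("PK").eq(f"CUSTOMER#{customer_id}")
            & Key("SK").begins_with("ORDER#"),
            ScanIndexForward=False,
        )
        return resp["Items"]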
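
And since databases here are managed as code from day one, the Terraform/CDK bullet can be illustrated with a minimal provisioning sketch, assuming AWS CDK v2 for Python; the stack, VPC, and instance settings are placeholders, not the team's actual configuration.

    from aws_cdk import App, Stack, aws_ec2 as ec2, aws_rds as rds
    from constructs import Construct

    class DatabaseStack(Stack):
        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(scope, construct_id, **kwargs)
            vpc = ec2.Vpc(self, "PlatformVpc", max_azs=2)
            # Managed PostgreSQL; generated credentials land in Secrets Manager.
            rds.DatabaseInstance(
                self, "AppDb",
                engine=rds.DatabaseInstanceEngine.postgres(
                    version=rds.PostgresEngineVersion.VER_15
                ),
                vpc=vpc,
                instance_type=ec2.InstanceType.of(
                    ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.MEDIUM
                ),
                credentials=rds.Credentials.from_generated_secret("dbadmin"),
                multi_az=True,
            )

    app = App()
    DatabaseStack(app, "DatabaseStack")
    app.synth()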

Nice To Haves

  • Familiarity with Azure SQL, Azure Data Factory, or Azure Synapse is a plus.
  • AWS certifications: AWS Database Specialty, AWS Solutions Architect, AWS Data Engineer Associate.
  • Snowflake SnowPro Core or Advanced Data Engineer certification.
  • Experience with Apache Iceberg, Delta Lake, or Hudi for open table format lakehouse architectures.
  • Hands-on experience with SageMaker Feature Store, Model Registry, or MLflow for MLOps workflows.
  • Familiarity with data observability platforms (Monte Carlo, Bigeye) or custom observability with Great Expectations / dbt tests (see the sketch after this list).
  • Experience with graph databases (Neptune) or time-series databases (Timestream) in AWS.
  • Exposure to Databricks on AWS or Azure for unified data and AI workloads.
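
As a taste of the custom-observability item above, here is a minimal sketch of hand-rolled data-quality checks in pandas. It is a simplified stand-in for what Great Expectations or dbt tests provide, and the column names and thresholds are hypothetical.

    import pandas as pd

    def check_not_null(df: pd.DataFrame, column: str, threshold: float = 0.0) -> bool:
        # Pass only if the null ratio stays within the allowed threshold.
        return df[column].isna().mean() <= threshold

    def check_unique_key(df: pd.DataFrame, key_columns: list[str]) -> bool:
        # Primary-key style assertion: no duplicate rows on the key columns.
        return not df.duplicated(subset=key_columns).any()

    # Toy extract with one duplicate key and one null value.
    df = pd.DataFrame({"order_id": [1, 2, 2], "amount": [10.0, None, 5.0]})
    print(check_not_null(df, "amount", threshold=0.5))  # True: 1/3 nulls is under 0.5
    print(check_unique_key(df, ["order_id"]))           # False: order_id 2 repeats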

Responsibilities

  • Design and build hybrid data solutions spanning relational (PostgreSQL, Aurora, RDS, Azure SQL), columnar (Redshift, Snowflake), and NoSQL (DynamoDB, DocumentDB, OpenSearch) engines — selecting the right engine per workload.
  • Architect cloud-native data lakehouse platforms on AWS using S3, Lake Formation, Glue, and open formats (Apache Iceberg, Delta Lake, Parquet), with Azure Data Lake as a secondary target.
  • Implement and manage Medallion Architecture (Bronze / Silver / Gold) patterns to support raw ingestion, curated analytics, and business-ready datasets.
  • Build and optimize hybrid data platforms spanning operational databases (PostgreSQL / RDS / Aurora / DynamoDB) and analytical systems (Snowflake / Redshift).
  • Develop and maintain semantic layers and analytics models to enable consistent, reusable metrics across BI, analytics, and AI use cases.
  • Engineer efficient data models, ETL/ELT pipelines, and query performance tuning for analytical and transactional workloads.
  • Engineer replication topologies, partitioning strategies, and data lifecycle automation as code — not manual DBA operations.
  • Build automated schema migration pipelines (Flyway/Liquibase) and data versioning workflows integrated into CI/CD, replacing manual schema change management.
  • Design and implement API-first data access patterns, enabling engineering teams to interact with databases through well-defined, versioned interfaces rather than direct connection strings.
  • Engineer ELT/ETL pipelines using AWS-native services (Glue, Kinesis, MSK, Step Functions, EventBridge) and modern tooling (dbt, Airflow) for batch, micro-batch, and near-real-time workloads.
  • Build streaming data pipelines using AWS Kinesis Data Streams, Kinesis Firehose, and MSK (Managed Kafka) for event-driven, low-latency ingestion across multiple database targets (see the producer sketch after this list).
  • Implement data quality checks, schema enforcement, lineage, and observability across pipelines.
  • Optimize performance, cost, and scalability across ingestion, transformation, and consumption layers.
  • Implement change data capture (CDC) using AWS DMS, Debezium, or native engine features to synchronize data across SQL, NoSQL, and analytical systems.
  • Design and optimize DynamoDB schemas using single-table design patterns, GSIs, LSIs, and DynamoDB Streams for event-driven architectures.
  • Architect DocumentDB (MongoDB-compatible) clusters for document workloads requiring flexible schema and hierarchical data models.
  • Build and manage OpenSearch / Elasticsearch clusters for full-text search, log analytics, and observability use cases.
  • Evaluate and recommend the right NoSQL engine (DynamoDB vs DocumentDB vs OpenSearch vs ElastiCache) based on access patterns, latency, and cost profile.
  • Implement TTL policies, DynamoDB Accelerator (DAX), and ElastiCache (Redis/Memcached) for high-throughput caching layers.
  • Apply AI and ML techniques to data architecture and operations, including intelligent data quality validation, anomaly detection, schema drift detection, and query workload pattern analysis — using AWS SageMaker and Amazon Bedrock.
  • Design and build ML-ready data foundations: SageMaker Feature Store, training dataset pipelines, experiment tracking, and inference data pipelines using AWS-native MLOps services.
  • Integrate LLM capabilities via Amazon Bedrock for AI-assisted data documentation, query generation, lineage summarization, and automated data cataloging.
  • Implement vector database solutions (pgvector on Aurora/RDS, OpenSearch k-NN) to support AI similarity search and retrieval-augmented generation (RAG) use cases (see the query sketch after this list).
  • Build AI-powered observability using ML-driven anomaly detection on pipeline metrics, query performance trends, and data quality SLAs.
  • Build and manage all data infrastructure as code using Terraform and AWS CDK — covering RDS, Aurora, DynamoDB, Redshift, Glue, MSK, Kinesis, Snowflake, and supporting IAM/networking components.
  • Integrate database changes into CI/CD pipelines (GitHub Actions, AWS CodePipeline) with automated schema testing, data contract validation, deployment, and rollback.
  • Develop internal platform tooling using Python, SQL, and AWS SDK (boto3) — building self-service capabilities that allow engineers to provision governed database environments on demand.
  • Implement database-as-code practices: automated schema migrations, snapshot/restore testing pipelines, and environment clone automation — eliminating manual DBA provisioning tasks.
  • Build and publish internal data platform APIs and SDKs that abstract database complexity from application teams.
  • Engineer enterprise-grade data governance across all engines: RBAC, column/row-level security, field-level encryption, dynamic data masking, and comprehensive audit logging, implemented as code, not manual configuration.
  • Define and enforce data contracts and ownership using AWS Lake Formation, Glue Data Catalog, and Snowflake governance — versioned and managed in source control.
  • Partner with Security and Compliance teams to ensure audit readiness and regulatory alignment (SOC 2, HIPAA, GDPR where applicable).
  • Manage AWS IAM policies, KMS encryption, VPC security groups, and private endpoints (PrivateLink, VPC Endpoints) for least-privilege access and network isolation.
  • Implement secrets management using AWS Secrets Manager and Parameter Store with automated credential rotation for all database engines (see the retrieval sketch below).
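
To ground the streaming-ingestion responsibility, a minimal producer sketch in Python with boto3; the stream name and event shape are hypothetical.

    import json
    import boto3

    kinesis = boto3.client("kinesis")

    def publish_change_event(stream_name: str, record: dict) -> None:
        # PartitionKey controls shard routing; keying on the source table
        # keeps per-table ordering within a shard.
        kinesis.put_record(
            StreamName=stream_name,
            Data=json.dumps(record).encode("utf-8"),
            PartitionKey=record["table"],
        )

    publish_change_event(
        "db-change-events",  # illustrative stream name
        {"table": "orders", "op": "INSERT", "id": 42},
    )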
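
For the vector-search responsibility, a minimal pgvector similarity query, assuming psycopg2, the pgvector extension installed, and a documents table with an embedding vector column; the DSN, names, and toy embedding are illustrative.

    import psycopg2

    conn = psycopg2.connect("dbname=app user=app host=localhost")  # illustrative DSN

    query_embedding = [0.12, -0.04, 0.33]  # toy 3-dim vector; real models emit 768+
    vec_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"

    with conn, conn.cursor() as cur:
        # "<->" is pgvector's L2-distance operator; the cast marks the parameter type.
        cur.execute(
            "SELECT doc_id, content FROM documents "
            "ORDER BY embedding <-> %s::vector LIMIT 5",
            (vec_literal,),
        )
        for doc_id, content in cur.fetchall():
            print(doc_id, content)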
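
And for the secrets-management item, a minimal sketch of fetching rotated database credentials with boto3. The secret name is illustrative, and the JSON keys depend on how the secret was written; RDS-managed rotation stores host, username, and password among other fields.

    import json
    import boto3

    client = boto3.client("secretsmanager")

    def db_credentials(secret_id: str) -> dict:
        # Always fetch the current version so rotation never strands callers
        # with a stale password.
        resp = client.get_secret_value(SecretId=secret_id)
        return json.loads(resp["SecretString"])

    creds = db_credentials("prod/platform/postgres")  # illustrative secret name
    dsn = f"host={creds['host']} user={creds['username']} password={creds['password']}"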

Benefits

  • Excellent and affordable medical benefits + non-medical perks including Student Loan Reimbursement, Flexible Paid Time Off, and Paid Parental Leave
  • 401(k) Plan with a Company Match to prepare for your future
  • Robust Learning & Development opportunities, including 700+ development courses free to all employees