Senior Data Engineer

BlastPoint
$140 - $160 · Remote

About The Position

BlastPoint is a B2B data analytics startup located in the East Liberty neighborhood of Pittsburgh. We give companies the power to solve business problems by discovering the humans in their data and understanding how they think. Serving diverse industries including energy, retail, finance, and transportation, BlastPoint’s software platform helps companies plan solutions to customer-facing challenges, from encouraging green behavior to managing customers’ financial stress. Founded in 2016 by Carnegie Mellon alumni, we are a tight-knit, forward-thinking team.

We are seeking a talented Senior Data Engineer to own and evolve our data processing pipeline. You'll work across a production-scale medallion architecture that ingests, transforms, and delivers customer data through a multi-stage pipeline serving clients in the utility and financial services industries. This role sits at the center of our data infrastructure: building the pipelines and tooling that power everything from daily data refreshes to ML feature engineering to platform delivery.

Requirements

  • Bachelor's degree in a related field (such as Data Engineering, Computer Science, Data Science, Math, or Statistics) and 3+ years of experience, or 5+ years of relevant experience.
  • Experience designing and maintaining production ETL/ELT pipelines with proper error handling, idempotency, and monitoring.
  • Advanced proficiency in Python, with deep experience in Pandas and PySpark (DataFrame API, SQL, performance tuning, distributed joins); a brief illustrative sketch follows this list.
  • Strong SQL skills with PostgreSQL, including query optimization, indexing strategies, and schema design.
  • Hands-on experience with AWS services including but not limited to: S3, Lambda, Batch, SageMaker, and Step Functions.
  • Experience with PyArrow, columnar data formats (Parquet), and data lake patterns.
  • Strong problem-solving skills with the ability to work autonomously, make architectural decisions, and manage multiple concurrent projects.
  • Excellent communication skills with the ability to drive cross-functional collaboration, proactively engaging stakeholders to align on requirements and solutions.
  • Experience using Git for version control and repository management.
  • Authorized to work in the United States.
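
To give a concrete sense of the PySpark work referenced above, here is a minimal, hypothetical sketch of a distributed join with a broadcast hint; the paths, table names, and columns are illustrative assumptions, not BlastPoint's actual schema.

    # Hypothetical example: join a large account table to a small lookup table
    # with a broadcast hint to avoid a shuffle, then aggregate for a
    # silver-layer output.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("refresh-sketch").getOrCreate()

    accounts = spark.read.parquet("s3://example-bucket/bronze/accounts/")      # large fact table
    segments = spark.read.parquet("s3://example-bucket/reference/segments/")   # small dimension table

    enriched = (
        accounts
        .join(F.broadcast(segments), on="segment_id", how="left")
        .groupBy("segment_name")
        .agg(
            F.count("*").alias("account_count"),
            F.avg("monthly_usage_kwh").alias("avg_usage"),
        )
    )

    enriched.write.mode("overwrite").parquet("s3://example-bucket/silver/segment_summary/")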

Nice To Haves

  • Experience with Infrastructure as Code (Terraform).
  • Experience implementing observability solutions (monitoring, logging, alerting) for production data pipelines.
  • Experience developing REST APIs with FastAPI, SQLAlchemy, and Alembic (or equivalent web frameworks and ORMs).
  • Understanding of MLOps.
  • Experience building and deploying LLM-powered agents.
  • Experience with Apache Iceberg or similar data lakehouse technologies.
  • Experience with geospatial data processing (geocoding, spatial joins); see the brief sketch after this list.
  • Familiarity with React/TypeScript for contributing to internal tooling.
  • Understanding of CI/CD (GitHub Actions).
  • Experience mentoring junior engineers.
  • Willingness to travel domestically for company events (roughly 2-4 times per year).
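
The geospatial item above similarly lends itself to a short sketch; the following is a hedged example using geopandas (file paths and columns are hypothetical, and the predicate keyword assumes geopandas >= 0.10).

    # Hypothetical example: tag customer points with the service-territory
    # polygon they fall inside via a spatial join.
    import pandas as pd
    import geopandas as gpd

    customers = pd.read_csv("customers.csv")  # assumed to contain lon/lat columns
    points = gpd.GeoDataFrame(
        customers,
        geometry=gpd.points_from_xy(customers["lon"], customers["lat"]),
        crs="EPSG:4326",
    )
    territories = gpd.read_file("territories.geojson").to_crs("EPSG:4326")

    tagged = gpd.sjoin(points, territories, how="left", predicate="within")
    print(tagged[["customer_id", "territory_name"]].head())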

Responsibilities

  • Design, develop, and maintain our core Python ETL framework by writing reusable, well-tested modules that power data transformations across client pipelines.
  • Develop and optimize our automated refresh pipeline orchestrated through AWS Batch, Lambda, Step Functions, and EventBridge.
  • Build Python integrations with external systems (SFTP, third-party APIs, client platforms) that are robust, testable, and reusable.
  • Identify and eliminate manual bottlenecks in data onboarding and analysis through well-designed automation.
  • Build and extend internal web applications (FastAPI, SQLAlchemy, PostgreSQL) that support pipeline orchestration, client configuration, and data platform operations; a minimal sketch follows this list.
  • Ensure data integrity and security throughout project lifecycles.
  • Write efficient server-side Python code, leveraging the Pandas and PySpark DataFrame APIs for scalable data transformations and aggregations.
  • Optimize Spark jobs for cost and performance at scale.
  • Debug complex data quality issues across client pipelines.
  • Mentor junior engineers on data transformation patterns, aggregation frameworks, and best practices.
  • Contribute to our internal metadata management application (FastAPI backend, React/TypeScript frontend).
  • Build API endpoints, write database migrations, and occasionally develop frontend features.
  • Maintain the metadata layer that drives pipeline configuration and data governance.
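
As a hedged illustration of the internal web application work above, here is a minimal FastAPI + SQLAlchemy endpoint sketch; the model, table, and connection string are assumptions for illustration only, not BlastPoint's actual metadata schema.

    # Hypothetical example: expose pipeline configuration records through a
    # read-only endpoint backed by PostgreSQL (SQLAlchemy 2.0-style models).
    from fastapi import Depends, FastAPI
    from sqlalchemy import String, create_engine, select
    from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column, sessionmaker

    engine = create_engine("postgresql+psycopg2://user:pass@localhost/metadata")  # placeholder DSN
    SessionLocal = sessionmaker(bind=engine)

    class Base(DeclarativeBase):
        pass

    class PipelineConfig(Base):
        __tablename__ = "pipeline_configs"
        id: Mapped[int] = mapped_column(primary_key=True)
        client: Mapped[str] = mapped_column(String(64))
        schedule: Mapped[str] = mapped_column(String(32))

    app = FastAPI()

    def get_db():
        db = SessionLocal()
        try:
            yield db
        finally:
            db.close()

    @app.get("/configs/{client}")
    def read_configs(client: str, db: Session = Depends(get_db)):
        rows = db.execute(select(PipelineConfig).where(PipelineConfig.client == client)).scalars().all()
        return [{"id": r.id, "client": r.client, "schedule": r.schedule} for r in rows]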

Benefits

  • Schedule and work-from-home flexibility
  • Health insurance
  • 401(k)
  • Three weeks of PTO
  • Tailored growth opportunities, from skills training to industry conferences