Mid-Level Engineer, Global

Vantage Data Centers, Denver, CO
$100,000 - $110,000 (Hybrid)

About The Position

This position will be based on-site at our office in Denver, CO, in alignment with our flexible work policy (three days on-site required, two days flexible).

Vantage Data Centers is seeking a Mid-Level Data Engineer to help build, operate, and scale our enterprise data platform. This role is designed for an engineer who can operate independently, execute reliably in a fast-paced environment, and take ownership of data pipelines and datasets with minimal ramp-up. As part of the Data Engineering & Business Intelligence team, you will be responsible for delivering production-ready data solutions that support analytics, reporting, and emerging AI-enabled use cases. You will work closely with senior data engineers and business partners, but this role assumes a self-starter mindset and the ability to move from requirements to implementation without constant oversight. Success in this position requires comfort with ambiguity, strong execution discipline, and accountability for results.

Requirements

  • Bachelor’s degree in Engineering, Computer Science, Data Analytics, or a related field, or equivalent experience.
  • Minimum of 3–5 years of experience in data engineering or analytics engineering.
  • Proficiency in Python for building and maintaining data pipelines, automation, and data processing workflows, including use of PySpark.
  • Proficiency in SQL for querying, transformation, and analytical data processing.
  • Solid understanding of ETL/ELT pipelines, data transformation patterns, and data integration concepts.
  • Experience analyzing enterprise data sources to identify data relationships, transformations, and business rules.
  • Experience building solutions on the Microsoft Azure platform with exposure to services such as Azure Data Factory, Azure Synapse, Azure Data Lake Storage Gen2, and related analytics services.
  • Experience working with source control and CI/CD workflows using tools such as GitHub or Azure DevOps.
  • Working knowledge of data modeling fundamentals, including fact and dimension tables.
  • Strong communication and interpersonal skills with the ability to collaborate across teams in a fast‑paced environment.
  • Experience working in Agile development environments.
  • Experience using collaboration and project tracking tools such as Jira.
  • Travel is expected to be up to 10% but may increase over time as the business evolves.

Nice To Haves

  • Experience working with distributed data processing frameworks, including Apache Spark.
  • Exposure to advanced analytics or AI‑adjacent data use cases, including preparing data for machine learning or intelligent applications.
  • Familiarity with additional Azure services such as Azure Functions or Logic Apps in support of data workflows.
  • Experience supporting data platform enhancement, refactoring, or modernization initiatives.
  • Familiarity with data quality, reliability, and operational best practices in production environments.
  • Experience working in a scaling or fast‑paced organization where priorities evolve quickly.

Responsibilities

  • Design, build, and maintain reliable, scalable data pipelines using Python and PySpark on the Microsoft Azure data platform.
  • Develop and operate batch and incremental data pipelines leveraging Azure Data Factory for orchestration and Azure Data Lake Storage Gen2 as the primary data store.
  • Independently implement SQL- and Spark‑based transformations to produce curated datasets that support enterprise reporting, analytics, and downstream consumption.
  • Take ownership of assigned data pipelines and datasets, including monitoring, troubleshooting, and performance optimization in production environments.
  • Work with Azure Synapse (dedicated or serverless where applicable) to support analytical workloads and data consumption patterns.
  • Collaborate with business analysts and cross‑functional stakeholders to translate data requirements into practical, working data solutions.
  • Prepare and structure data to support advanced analytics and AI‑enabled use cases by ensuring data quality, consistency, and documentation.
  • Apply established data governance, security, and engineering standards to ensure compliant, maintainable, and scalable solutions.
  • Participate in code reviews, technical discussions, and platform improvement initiatives as an active contributor.
  • Proactively identify data quality issues, pipeline risks, and improvement opportunities, and communicate them clearly in a fast‑paced environment.
  • Develop and maintain PySpark notebooks and jobs to ingest, transform, and curate data within the enterprise data platform.
  • Build and modify Azure Data Factory pipelines for batch and incremental data ingestion.
  • Implement Spark‑based transformations that write curated datasets to Azure Data Lake Storage Gen2 using established folder structures and naming conventions.
  • Create and maintain SQL views and tables in Azure Synapse to support analytics and reporting use cases.
  • Respond to pipeline failures, data validation issues, and operational alerts.
  • Perform basic performance tuning of Spark jobs (e.g., partitioning, filtering, incremental logic) within established architectural patterns and standards.
  • Validate data outputs with business partners and address data defects or discrepancies.
  • Commit code using Git, follow branching standards, and participate in pull request reviews.
  • Update documentation for pipelines, datasets, and operational runbooks as changes are made.
  • Execute assigned backlog items within sprint timelines and raise risks or blockers early.
  • Additional duties as assigned by management.

Benefits

  • This position is eligible for company benefits including but not limited to medical, dental, and vision coverage, life and AD&D, short and long-term disability coverage, paid time off, employee assistance, participation in a 401k program that includes company match, and many other additional voluntary benefits.