Cloud Data Engineer

GOVX, New York, NY
Remote

About The Position

The Cloud Data Engineer is part of a Data team responsible for supporting, modernizing, and transforming data and reporting capabilities across our products by implementing a modern data architecture. The position is responsible for the day-to-day collection, transportation, maintenance, and curation of GOVX corporate data assets, and for access to them. The Cloud Data Engineer works cross-functionally across the enterprise to centralize and standardize data for use in business reporting, machine learning, AI, and data science, and by other stakeholders. This position plays a critical role in increasing awareness of available data and democratizing access to it across GOVX and our data partners. The position reports to the Business Intelligence Manager.

Requirements

  • Hands-on experience developing, debugging, and optimizing Spark notebooks for ETL and analytics in Microsoft Fabric and Azure.
  • Deep expertise in Microsoft Fabric, Dataflows Gen2, and Power BI integration.
  • Hands-on experience with Delta Lake table management, including schema evolution, versioning, and data compaction.
  • Experience with Data Lakehouse and Medallion Architecture.
  • Experience with CI/CD and version control using Git.
  • Advanced SQL and NoSQL query authoring; Python and Spark scripting.
  • Proficiency with object-oriented and functional scripting in languages and frameworks such as Python and Spark.
  • Proficiency with Metadata-Driven Design and JSON.
  • Experience working with streams such as Event Hubs and Event Driven Architectures.
  • Experience building, maintaining, and optimizing ‘big data’ data pipelines, architectures, and data sets.
  • Experience cleaning, testing, and evaluating data quality from a wide variety of ingestible data sources.
  • Knowledge of Microsoft Power Platform, including Copilot Studio and Power Apps.
  • Strong collaboration and communication skills with business and technical teams.

Nice To Haves

  • Bachelor’s degree or equivalent experience.
  • 7+ years of proven experience deploying and maintaining always-on data services.
  • 2+ years building and maintaining Spark notebook-based pipelines in Microsoft Fabric.
  • 2+ years of experience working with Microsoft Fabric.
  • 2+ years of experience working with Microsoft Azure.
  • 2+ years of experience with Delta Lake.
  • 1+ years of experience with SQL Server.

Responsibilities

  • Support and modernize existing data integrations.
  • Design and maintain efficient data pipeline and data flow architecture.
  • Assemble large, complex data sets that meet business requirements.
  • Identify, design, and implement internal process improvements, such as automating manual processes, optimizing data delivery, and redesigning infrastructure for greater scalability.
  • Partner with business analysts, data scientists, and IT teams to translate business requirements into scalable data solutions using Fabric, Spark, and Delta Lake.
  • Develop Spark notebooks for both batch and streaming ETL pipelines leveraging Delta Lake.
  • Implement and optimize Delta Lake features including schema enforcement, schema evolution, and time travel for robust data management. Optimize Delta Lake tables for performance using Z-ordering, compaction, and partitioning strategies.
  • Work with the team toward clean, meaningful data and greater functionality and flexibility in the team’s data systems.
  • Design processes supporting data transformation, data structures, metadata, dependency, and workload management.

Benefits

  • Paid Time Off
  • Paid Sick Leave
  • Paid Holidays
  • Competitive Medical, Dental, Vision, Short Term Disability, and Life Insurance
  • 401(k) plan with discretionary match available
  • Flexible Spending Account (FSA)
  • Health Savings Account (HSA)
  • Voluntary benefits including Critical Illness, Group Accident, and Voluntary Life
  • Employee Referral Program
  • Exposure to a growing ecommerce company
  • Discounts on the GOVX website