Data Engineer - Marketing Technology

Foxit • Alpharetta, GA

About The Position

Foxit is seeking an experienced Data Engineer to manage the data pipelines that support its go-to-market systems. Although the role is aligned with marketing priorities, it sits within the Business Applications & Data Analytics team and follows that team's established development standards, architecture patterns, and code review processes, so that pipelines remain consistent and maintainable within the broader data platform. The primary focus is marketing-related data needs, which involves collaborating with demand generation, product marketing, sales operations, and digital teams to understand requirements and to build and maintain the necessary pipelines and integrations. This is a hands-on role that involves building and operating the data infrastructure connecting marketing automation (HubSpot), CRM (Salesforce), the data warehouse (Databricks), licensing, and payment systems. The work directly enables marketing's capabilities in audience segmentation, attribution measurement, and data-driven campaign execution at scale.

Requirements

5+ years of experience in data engineering, with hands-on pipeline development and production operations.
Strong proficiency in SQL and Python/PySpark for data pipeline development.
Experience building and maintaining ETL/ELT pipelines using Databricks, dbt, Airflow, Azure Data Factory, or equivalent.
Hands-on experience with cloud data platforms - Databricks, Snowflake, BigQuery, or Redshift.
Solid understanding of dimensional data modeling - fact tables, dimension tables, schema design, and data warehouse concepts.
Experience with medallion or layered data architectures (raw → cleansed → business-ready), Kimball-style star schemas, and one-big-table approaches to data modeling.
Working knowledge of API integration patterns - REST, webhooks, OAuth, batch sync architectures.
Experience with CRM platforms (Salesforce, HubSpot, or similar), marketing automation systems, and CPQ/quoting tools (DealHub or similar).
Bachelor's degree in Computer Science, Information Systems, or equivalent industry experience.

Nice To Haves

Experience with Databricks (Delta Lake, PySpark, Unity Catalog).
Familiarity with HubSpot APIs and data model.
Experience with identity resolution and customer data deduplication across multiple source systems.
Exposure to marketing data concepts - lead scoring, audience segmentation, campaign attribution, lifecycle stages.
Experience with Azure cloud services (Azure Functions, Azure DevOps, Azure Data Factory).
Knowledge of data security and privacy practices, particularly regarding PII handling.
Experience with code review processes and development standards compliance in a collaborative data engineering team.

Responsibilities

Design, build, and maintain ETL/ELT pipelines that move data between source systems (Salesforce CRM, HubSpot, NetSuite, Stripe, DealHub, LMS) and the Databricks data warehouse, and optimize the existing medallion architecture (Bronze → Silver → Gold).
Develop pipelines using PySpark and SQL within Databricks notebooks, adhering to established development standards for naming, project structure, and layer-appropriate transformations.
Manage the data sync layer between Databricks and HubSpot, including enrichment flows inbound to HubSpot (license status, renewal dates, subscription state, firmographic data) and marketing engagement data outbound from HubSpot (email events, workflow enrollment, lifecycle changes).
Build and maintain Exchange layer pipelines to curate data for external system consumption, ensuring proper formatting and validation to meet target system requirements.
Construct and maintain scheduled batch jobs and event-driven integrations utilizing APIs (REST, webhooks, OAuth).
Monitor pipeline health, implement alerting for failures and data quality degradation, and manage incident response for sync issues.
Maintain documentation for data flows, integration architecture, and troubleshooting runbooks.
Build and maintain dimensional models in Databricks (fact tables, dimension tables, bridge tables) according to data warehouse object type definitions and naming standards.
Collaborate with stakeholders and data analysts to create curated, business-ready tables and datamarts with applied business logic, KPI calculations, and aggregations optimized for analytics and campaign activation.
Implement identity resolution and deduplication logic to generate unified customer profiles from multiple source systems.
Establish data validation rules, quality checks, and monitoring to ensure the accuracy and freshness of data flowing into marketing systems.
Normalize disparate data sources into clean, centralized schemas with proper type enforcement, deduplication, and null handling.
Ensure data infrastructure supports audience segmentation, including firmographic, behavioral, and engagement signals.
Build the data layer for lifecycle marketing, enabling triggered campaigns, dynamic journey branching, and personalization based on enriched customer profiles.
Support marketing and demand generation teams by providing reliable, accessible data for audience targeting in HubSpot.
Maintain data flows for email deliverability, subscription management, and suppression list synchronization.
Build and maintain API integrations between marketing, sales, and operational systems using Python and SQL.
Implement field-level transformation logic, sync orchestration, and error handling for system-to-system data flows.
Support website form and lead capture data flows, ensuring a clean handoff from web properties into HubSpot and Databricks.
Collaborate with third-party enrichment providers (firmographic, intent, technographic) to integrate enrichment data into automated workflows.
Build and maintain the data infrastructure supporting campaign attribution, channel performance analysis, and funnel reporting.
Ensure accurate data for conversion analytics, lead source tracking, and marketing ROI measurement.
Support centralized reporting by routing marketing engagement data back into Databricks for cross-functional analysis.