Vice President, Data Engineering

Ares Management Corporation, New York, NY

About The Position

Over the last 20 years, Ares’ success has been driven by our people and our culture. Today, our team is guided by our core values – Collaborative, Responsible, Entrepreneurial, Self-Aware, Trustworthy – and our purpose to be a catalyst for shared prosperity and a better future. Through our recruitment, career development and employee-focused programming, we are committed to fostering a welcoming and inclusive work environment where high-performance talent of diverse backgrounds, experiences, and perspectives can build careers within this exciting and growing industry.

Position Overview

We seek a VP Data Engineer to own critical data pipelines and establish architectural patterns for our Databricks-on-Azure data platform. This is an opportunity to architect scalable ETL/ELT patterns, establish best practices for Databricks/Spark/Delta Lake development, and design systems that handle both structured and unstructured data at scale. You will work closely with the Principal Head of Data Engineering and VP Staff Data Engineer to build pipelines that power AI-ready infrastructure. Your code and patterns become the template for how the team builds data systems.

Requirements

  • 6-9 years of data engineering experience with 2+ years at senior level or equivalent complexity
  • Expert-level proficiency in Databricks/Delta Lake: notebook development, SQL, Spark, performance tuning
  • Advanced SQL expertise: complex joins, window functions, CTEs, query optimization
  • Strong Python proficiency: PySpark, pandas, data validation libraries
  • Proven experience building ETL/ELT pipelines at scale (100GB+ datasets, multi-source ingestion)
  • Deep understanding of Delta Lake: transactions, ACID properties, schema evolution, merge operations
  • Experience with Azure cloud services: ADLS, Azure SQL, Event Hubs, blob storage, Azure Key Vault
  • Demonstrated experience with document and unstructured data processing
  • Experience with data orchestration tools (Prefect, Airflow, Databricks Workflows) and building robust error handling
  • Ability to mentor other engineers and lead by example
  • Comfort with greenfield projects and establishing best practices from scratch
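To make the SQL expectations above (complex joins, window functions, CTEs) concrete, here is a minimal, illustrative sketch of a common interview-style pattern: deduplicating records with a CTE plus `ROW_NUMBER()`. It uses Python's built-in `sqlite3` driver only so the snippet is self-contained; the `trades` table and its columns are invented for this example.

```python
import sqlite3

# Illustrative only: keep the latest row per trade_id using a CTE
# and the ROW_NUMBER() window function (requires SQLite 3.25+).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trades (trade_id INTEGER, amount REAL, loaded_at TEXT);
INSERT INTO trades VALUES
  (1, 100.0, '2024-01-01'),
  (1, 110.0, '2024-01-02'),  -- later version of trade 1
  (2, 250.0, '2024-01-01');
""")

rows = conn.execute("""
WITH ranked AS (
  SELECT trade_id, amount,
         ROW_NUMBER() OVER (PARTITION BY trade_id
                            ORDER BY loaded_at DESC) AS rn
  FROM trades
)
SELECT trade_id, amount FROM ranked WHERE rn = 1 ORDER BY trade_id
""").fetchall()
print(rows)  # [(1, 110.0), (2, 250.0)]
```

The same CTE-plus-window-function shape carries over directly to Spark SQL on Databricks.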

Nice To Haves

  • Production experience with Databricks Unity Catalog and governance features
  • Experience with Databricks SQL and serverless compute
  • Hands-on experience with document extraction: PDFs, forms, OCR, table extraction
  • Familiarity with Azure AI Services: Form Recognizer, Document Intelligence, Cognitive Search
  • Experience with NLP libraries (spaCy, NLTK) and text preprocessing at scale
  • Experience in financial services or PE environments
  • Familiarity with dbt for transformation orchestration
  • Databricks certifications or demonstrated expertise
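The text-preprocessing work mentioned above typically includes chunking documents before embedding or LLM ingestion. As a rough sketch of that step, here is a simple overlapping character-window chunker; `chunk_text` and its parameters are invented for illustration (production pipelines usually chunk on token or sentence boundaries instead).

```python
def chunk_text(text: str, size: int = 20, overlap: int = 5) -> list[str]:
    """Split text into fixed-size character windows with overlap.

    Overlap preserves context across chunk boundaries, a common
    preprocessing step before embedding or LLM consumption.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

chunks = chunk_text("The quick brown fox jumps over the lazy dog",
                    size=20, overlap=5)
print(len(chunks))  # 3
```

Each chunk's last five characters repeat as the next chunk's first five, so no boundary context is lost.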

Responsibilities

Databricks-on-Azure Architecture & Optimization

  • Design and build complex Spark SQL and Python-based ETL/ELT pipelines in Databricks that handle large-scale data transformations
  • Master Delta Lake architecture: table design, partitioning strategies, file organization, Z-ordering for query optimization
  • Optimize Databricks cluster configurations: choose between interactive, job, and serverless compute based on workload; tune executor memory, shuffle partitions, and parallelism
  • Implement cost-efficient patterns: predicate pushdown, broadcast joins, caching strategies; right-size clusters and use spot instances for non-critical jobs
  • Design data quality frameworks within Databricks: schema validation, null handling, duplicate detection, completeness checks

Azure Integration & Data Movement

  • Design ADLS (Azure Data Lake Storage) layouts: bronze/silver/gold medallion architecture, folder structures, retention policies
  • Optimize Azure data movement: leverage Delta Live Tables (DLT), Databricks SQL, and managed ingest patterns
  • Design Databricks workspace integration: configure Azure AD authentication, scoped API tokens, cluster policies
  • Implement data governance via Databricks Unity Catalog: manage catalogs, schemas, table ACLs, lineage tracking

Document & Unstructured Data Processing

  • Design pipelines for document processing: ingest PDFs, Word docs, and other formats from blob storage into Databricks
  • Implement text extraction pipelines: use Databricks-native libraries and Azure AI Services (Form Recognizer, Document Intelligence)
  • Build structured extraction from unstructured data: extract financial tables, key metrics, and entities from deal documents and financial statements
  • Build preprocessing pipelines for NLP and LLM consumption: tokenization, chunking, metadata extraction, quality scoring
  • Manage document versioning and lineage: track source documents, extraction versions, and quality metrics

Core Pipeline Development

  • Own end-to-end design and implementation of critical pipelines supporting investment teams, ops/finance, and client/IR teams
  • Establish patterns for error handling, logging, and monitoring within Databricks jobs
  • Implement idempotent pipeline design: support re-runs, backfills, and late-arriving data
  • Design incremental data loading: leverage Delta Lake's merge operations, CDC patterns, and change tracking
  • Partner on schema design and dimensional modeling of enterprise data sets

Mentorship & Standards

  • Mentor junior and mid-level engineers through code review, pair programming, and design guidance
  • Establish Databricks/Spark best practices: naming conventions, notebook organization, cluster policies, testing patterns
  • Create reusable libraries and utilities: custom Spark functions, data quality frameworks, common transformations
  • Own code quality; your code is the reference for how the team builds
  • Document patterns and best practices; maintain an internal Confluence/knowledge base

Troubleshooting & Optimization

  • Debug complex Spark issues: shuffle spills, out-of-memory errors, performance bottlenecks
  • Optimize Databricks query performance: analyze execution plans, identify skew, apply optimization techniques
  • Manage cluster costs and performance: monitor job execution, identify inefficiencies, recommend cluster right-sizing
  • Lead postmortems and troubleshooting sessions for production issues
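The incremental-loading and idempotency responsibilities above both revolve around upsert (merge) semantics: matched keys update, unmatched keys insert, and re-running the same batch must not duplicate rows. Delta Lake's actual `MERGE INTO` runs on Spark; the dict-based `merge_batch` below is an invented, pure-Python stand-in that illustrates only the pattern.

```python
def merge_batch(target: dict, batch: list[dict], key: str = "id") -> dict:
    """Upsert each batch record into target, keyed by `key`.

    Mirrors the shape of a Delta Lake MERGE: matched keys are updated,
    unmatched keys are inserted. The result depends only on the final
    state per key, so re-running the same batch is a no-op (idempotent).
    """
    for record in batch:
        target[record[key]] = record  # update if present, insert if not
    return target

table = {1: {"id": 1, "value": "a"}}
batch = [{"id": 1, "value": "a2"}, {"id": 2, "value": "b"}]

merge_batch(table, batch)
merge_batch(table, batch)  # re-run: same final state, no duplicates
print(sorted(table))  # [1, 2]
```

The same key-based convergence property is what makes safe backfills and late-arriving-data handling possible in the real pipelines.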

Benefits

  • Comprehensive Medical/Rx, Dental and Vision plans
  • 401(k) program with company match
  • Flexible Savings Accounts (FSA)
  • Healthcare Savings Accounts (HSA) with company contribution
  • Basic and Voluntary Life Insurance
  • Long-Term Disability (LTD) and Short-Term Disability (STD) insurance
  • Employee Assistance Program (EAP)
  • Commuter Benefits plan for parking and transit
  • Access to a world-class medical advisory team
  • A mental health app that includes coaching, therapy and psychiatry
  • A mindfulness and wellbeing app
  • Financial wellness benefit that includes access to a financial advisor
  • New parent leave
  • Reproductive and adoption assistance
  • Emergency backup care
  • Matching gift program
  • Education sponsorship program