About The Position

This senior role leads the design, implementation, and operation of enterprise data platforms built on Starburst Enterprise (Trino) and Apache Iceberg, spanning architecture, performance tuning, data federation, security, and technical mentorship. Detailed responsibilities and requirements are listed below.

Requirements

  • 10+ years of hands-on experience in data engineering, data warehousing, or big data analytics projects.
  • Minimum of 3 years of direct experience with Starburst Enterprise (Trino/PrestoSQL) in a production environment, including configuration and administration.
  • Deep expertise in SQL, with advanced query writing and optimization skills specific to Starburst/Trino.
  • Proven experience with Apache Iceberg, including table creation, schema evolution, time travel, and performance optimization when used with Starburst/Trino.
  • Strong understanding of distributed systems and parallel processing architectures.
  • Proficiency with various data connectors and their configuration within Starburst (e.g., S3, HDFS, Hive, Kafka, MongoDB).
  • Experience with cloud platforms (AWS, Azure, GCP) and their data services (e.g., S3, ADLS, GCS, EMR, Athena, Databricks).
  • Familiarity with data governance, metadata management, and data cataloging tools.
  • Experience with scripting languages (Python) for data manipulation and automation.
  • Solid understanding of data warehousing concepts, data modeling (dimensional, 3NF), and ETL/ELT processes.
  • Experience with large-scale data lakes and various file formats (Parquet, ORC, Avro).
  • Excellent analytical and problem-solving skills with a keen eye for detail.
  • Strong verbal and written communication skills, with the ability to articulate complex technical concepts to both technical and non-technical audiences.
  • Bachelor's or Master's degree in Computer Science, Engineering, Information Systems, or a related technical field.

Nice To Haves

  • Knowledge of containerization technologies (Docker, Kubernetes) is a plus.
  • Certification in Starburst, Trino, or relevant cloud data platforms.
  • Experience with data visualization tools (e.g., Tableau, Power BI, Looker) for connecting to Starburst.
  • Familiarity with CI/CD practices for data pipelines and infrastructure as code.
  • Experience in the financial services industry or other highly regulated environments.

Responsibilities

  • Lead the design and architecture of Starburst-based data solutions, ensuring scalability, performance, and reliability for enterprise-level data platforms.
  • Design, implement, and manage data lakes utilizing Apache Iceberg tables with Starburst, optimizing for ACID transactions, schema evolution, time travel, and high-performance queries.
  • Develop, implement, and optimize complex SQL queries and data pipelines using Starburst to extract, transform, and load (ETL/ELT) data from various sources.
  • Proactively monitor, troubleshoot, and fine-tune Starburst clusters and queries to ensure optimal performance, resource utilization, and cost efficiency.
  • Implement and manage data federation strategies using Starburst connectors to seamlessly integrate and query data across disparate systems (e.g., Data Lakes, RDBMS, NoSQL databases, Cloud Storage).
  • Collaborate with security and governance teams to implement robust data access controls, encryption, and compliance measures within the Starburst ecosystem.
  • Provide technical leadership and mentorship to data engineers and analysts, fostering best practices in Starburst usage, data modeling, and query optimization.
  • Contribute to the long-term data strategy and roadmap, evaluating new features, technologies, and methodologies related to Starburst and the broader data ecosystem.
  • Work closely with data scientists, business analysts, application developers, and other stakeholders to understand data requirements and deliver effective solutions.
  • Create and maintain comprehensive technical documentation for Starburst configurations, data models, query patterns, and operational procedures.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Number of Employees: 5,001-10,000 employees
