University of Texas at Austin • Posted 14 days ago
Full-time • Mid Level
Remote • Prior Lake, MN
251-500 employees

The Lead Data Engineer for the UT Data Hub improves university outcomes and advances the UT mission to transform lives for the benefit of society by increasing the usability and value of institutional data. You will lead senior data engineers and data engineers in creating complex data pipelines within UT’s cloud data ecosystem in support of academic and administrative needs. In collaboration with our team of data professionals, you will help build and run a modern data hub that enables advanced data-driven decision making for UT. You will leverage your creativity to solve complex technical problems and build effective relationships through open communication within the team and with outside partners.

Technical Leadership:
  • Design, architect, and deliver production-grade, scalable data pipelines and AI-ready data platforms using Databricks, AWS cloud-native services, and modern data engineering frameworks.
  • Lead end-to-end implementation of lakehouse data pipelines, ensuring performance, reliability, and cost efficiency.
  • Champion industry best practices for data engineering.
  • Conduct and participate in peer code reviews to maintain code quality and consistency across the team.
  • Proactively identify and resolve bottlenecks in data ingestion, transformation, and orchestration processes using Databricks Delta Live Tables, Spark optimization techniques, and workflow automation.
  • Implement systems for data quality, observability, governance, and compliance using tools such as Unity Catalog, Delta Lake, and data validation frameworks (as sketched below).
  • Lead technical knowledge-sharing sessions on topics such as AI/ML integration, data lakehouse architecture, and emerging data technologies.
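
By way of illustration, here is a minimal sketch of the kind of Delta Live Tables pipeline with built-in data quality expectations that these responsibilities describe. The table names, source path, and validation rules are hypothetical and not part of this posting.

```python
# Illustrative sketch only: meant to run inside a Databricks Delta Live
# Tables pipeline, where `spark` is provided by the runtime; table names
# and the S3 path below are hypothetical.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw enrollment records landed from cloud storage.")
def enrollments_raw():
    # Auto Loader incrementally picks up new files from the source prefix.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("s3://example-bucket/enrollments/")  # hypothetical location
    )

@dlt.table(comment="Validated enrollments for downstream marts.")
@dlt.expect_or_drop("valid_student_id", "student_id IS NOT NULL")
@dlt.expect_or_drop("valid_term", "term_code IS NOT NULL")
def enrollments_clean():
    # The expectations above drop rows that fail validation and surface
    # pass/fail counts in the pipeline's data quality metrics.
    return dlt.read_stream("enrollments_raw").withColumn(
        "ingested_at", F.current_timestamp()
    )
```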

Project Management:
  • Define project milestones, timelines, and deliverables for data and AI initiatives, ensuring timely and high-quality outcomes.
  • Collaborate with both internal and external stakeholders such as data architects, system architects, business users, Agile team members, and other D2I internal groups.
  • Manage project priorities, sprint planning, and team workloads while balancing innovation with delivery.
  • Communicate risks, dependencies, and resource constraints effectively, and develop mitigation plans for on-time project delivery.

Team Management and Leadership:
  • Supervise and mentor a team of Data Engineers (2–5 individuals) working on cloud, Databricks, and AI pipeline initiatives.
  • Foster a culture of continuous learning, experimentation, and technical excellence, encouraging engineers to explore AI and automation use cases.
  • Participate in recruiting, onboarding, and developing data engineering talent with strong Databricks and AI skillsets.
  • Conduct performance reviews, set development goals, and create individualized growth plans for team members.
  • Encourage collaboration across Data, AI/ML, Analytics, and Infrastructure teams to drive cross-functional success.

Communication:
  • Provide regular updates on progress, milestones, and technical challenges to both technical and business stakeholders.
  • Translate complex technical concepts related to Databricks, AI, and data architecture into clear narratives for non-technical audiences.
  • Foster a transparent communication culture and provide actionable feedback to promote a growth mindset.
  • Ensure all data engineering processes, architectures, and standards are well-documented for reuse, governance, and knowledge continuity.

Innovation and Other Duties:
  • Stay current with advancements in AI, data engineering, and the Databricks ecosystem, evaluating new tools and frameworks for potential adoption.
  • Pilot and promote innovative solutions such as AI-assisted data quality checks, data observability automation, and intelligent pipeline optimization.
  • Perform other duties as assigned, contributing to the organization’s data-driven and AI-enabled transformation.

Required Qualifications:
  • Bachelor’s or Master’s degree in Computer Science, Information Systems, Engineering, or equivalent professional experience.
  • 5+ years of experience designing, implementing, and optimizing complex, production-grade data pipelines or enterprise-scale data platforms.
  • 5 years of experience in cloud-based data engineering using Databricks and Amazon Web Services (AWS), including services such as Glue, S3, Lambda, and Redshift.
  • 3+ years of experience managing or leading teams of data and/or software engineers, including mentorship, performance management, and project delivery.
  • Expertise in Python, PySpark, and SQL, with strong understanding of data modeling, stored procedures, and scalable data transformations.
  • Proven experience architecting and implementing ETL/ELT solutions across relational, non-relational, and lakehouse environments (e.g., Delta Lake, Parquet, or Iceberg).
  • Experience designing and managing CI/CD pipelines and infrastructure as code (IaC) using tools such as Databricks Repos, CDK, Terraform, or GitHub Actions.
  • Demonstrated knowledge of test-driven development (TDD) and data quality frameworks, ensuring reliability and reproducibility across data workflows (see the sketch after this list).
  • Deep understanding of data governance, security, and compliance standards in cloud environments.
  • Excellent analytical, problem-solving, and debugging skills across distributed data systems.
  • Proven ability to communicate complex technical concepts clearly to both technical and non-technical audiences.
  • Experience supervising, mentoring, and guiding junior team members on technical and professional development.
  • Equivalent combination of relevant education and experience may be substituted as appropriate.
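
As an illustration of the test-driven data engineering style called for above, here is a minimal sketch of a unit-testable PySpark transformation; the function name, column names, and test data are hypothetical.

```python
# Illustrative sketch only: function and column names are hypothetical.
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window

def latest_per_key(df: DataFrame, key: str, ts_col: str) -> DataFrame:
    """Keep the most recent row per key -- a common dedup step in ELT loads."""
    w = Window.partitionBy(key).orderBy(F.col(ts_col).desc())
    return (
        df.withColumn("_rn", F.row_number().over(w))
          .filter(F.col("_rn") == 1)
          .drop("_rn")
    )

def test_latest_per_key():
    # A pytest-style unit test pinning the transformation's behavior.
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame(
        [("a", 1), ("a", 2), ("b", 1)], ["id", "updated_at"]
    )
    rows = latest_per_key(df, "id", "updated_at").collect()
    assert {(r.id, r.updated_at) for r in rows} == {("a", 2), ("b", 1)}
```

Keeping transformations as pure DataFrame-to-DataFrame functions makes them easy to pin down with fast local tests before they run against production tables.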

Preferred Qualifications:
  • 8+ years of experience in Data Engineering or related fields, including 5+ years of hands-on experience building and optimizing data pipelines on Databricks or similar large-scale data platforms.
  • Proven experience implementing lakehouse architectures leveraging Databricks Delta Lake, Delta Live Tables, and Unity Catalog for governance and scalability.
  • Experience designing AI-ready data platforms and integrating machine learning pipelines using tools such as MLflow or model registry frameworks (see the sketch after this list).
  • 3+ years of experience managing or leading cross-functional technical teams, fostering collaboration between Data Engineering, Analytics, and AI/ML teams.
  • 5+ years of experience with Agile software development methodologies and project tracking systems such as JIRA.
  • Expertise in distributed data processing and streaming frameworks such as Apache Spark, Kafka, and Flink, and in orchestration tools such as Airflow for workflow automation.
  • Strong familiarity with data observability, cost optimization, and performance tuning in Databricks and cloud-native architectures.
  • Professional certifications such as Databricks Certified Data Engineer Professional, AWS Solutions Architect, or AWS Data Analytics Specialty are highly desirable.
  • Demonstrated ability to introduce new technologies and best practices to modernize existing data environments and promote AI/analytics maturity across the organization.
  • Passion for continuous learning and staying current with emerging technologies in data engineering, AI integration, and Databricks ecosystem advancements.
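
To illustrate the MLflow-based experiment tracking mentioned in the preferred qualifications, here is a minimal sketch; the experiment path, parameters, and metric value are placeholders rather than real results.

```python
# Illustrative sketch only: the experiment path, parameters, and metric
# value below are placeholders, not results.
import mlflow

mlflow.set_experiment("/Shared/enrollment-forecast")  # hypothetical path

with mlflow.start_run(run_name="baseline"):
    # Record what was trained and how, so runs are reproducible and
    # comparable in the tracking UI.
    mlflow.log_param("model_type", "gradient_boosting")
    mlflow.log_param("training_table", "enrollments_clean")
    # ... fit and evaluate a model here ...
    mlflow.log_metric("rmse", 0.0)  # placeholder, logged after evaluation
```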

Benefits:
  • Competitive health benefits (Employee premiums covered at 100%; family premiums at 50%)
  • Vision, dental, life, and disability insurance options
  • Paid vacation, sick leave, and holidays
  • Teachers Retirement System of Texas (a defined benefit retirement plan)
  • Additional voluntary retirement programs: tax sheltered annuity 403(b) and a deferred compensation program 457(b)
  • Flexible spending account options for medical and childcare expenses
  • Training and conference opportunities
  • Tuition assistance
  • Athletic ticket discounts
  • Access to UT Austin's libraries and museums
  • Free rides on all UT Shuttle and Capital Metro buses with staff ID card