University of Texas at Austin • Posted 8 days ago
Full-time • Mid Level
Remote • Prior Lake, MN
251-500 employees

The Principal Data Engineer for the UT Data Hub improves university outcomes and advances the UT mission to transform lives for the benefit of society by increasing the usability and value of institutional data. You will lead the data engineering team in adopting the latest data engineering trends and best practices to create complex data pipelines within UT’s cloud data ecosystem in support of academic and administrative needs. In collaboration with our team of data professionals, you will help build and run a modern data hub that enables advanced data-driven decision making for UT. You will apply your creativity to solve complex technical problems and build effective relationships through open communication, both within the team and with outside partners. This position has a heavy emphasis on Databricks and AI readiness.

  • Technical Leadership: Architect, design, and lead the development of enterprise-scale, production-grade data platforms and pipelines using Databricks and cloud-native technologies (AWS, Azure, or GCP). Champion the adoption of the Databricks Lakehouse architecture to unify data warehousing, data science, and machine learning workloads across the organization. Guide the design and deployment of AI-ready data pipelines to support predictive analytics, generative AI, and advanced decision intelligence use cases. Define and enforce data engineering standards, including performance optimization, scalability, data observability, and cost efficiency. Oversee code reviews, architecture reviews, and system design discussions to ensure technical excellence and maintainability across the engineering team. Lead the implementation of robust data quality, governance, and compliance frameworks, leveraging Databricks Unity Catalog and modern metadata management tools. Solve complex data architecture and integration challenges using advanced technologies such as Spark, Delta Live Tables, Airflow, and MLflow. Drive the development of automated, CI/CD-enabled data workflows and promote best practices in data infrastructure as code (IaC) and DevOps for data.
  • Project Management & Collaboration: Provide strategic technical leadership and mentorship to data engineering teams, fostering a collaborative environment that promotes innovation, accountability, and growth. Collaborate closely with data architects, AI/ML engineers, and analytics teams to align data solutions with organizational goals and research initiatives. Engage with cross-campus and cross-departmental technical groups to evangelize modern data practices and accelerate AI transformation initiatives. Lead knowledge-sharing sessions and architecture reviews on emerging data engineering trends, Databricks advancements, and AI integration techniques.
  • Communication: Effectively communicate technical strategies, project status, risks, and architecture decisions to both technical and non-technical stakeholders. Translate complex data engineering concepts into clear business impacts, helping decision-makers understand opportunities and trade-offs. Produce clear and detailed technical documentation, design specifications, and operational playbooks to support long-term scalability and training. Advocate for data engineering as a foundational enabler of AI, analytics, and digital transformation initiatives across the institution.
  • Innovation: Lead research and development efforts to evaluate and implement cutting-edge technologies within the Databricks ecosystem and broader AI/data landscape. Conduct feasibility studies and proofs of concept (POCs) for next-generation architectures involving AI model integration, real-time streaming, and intelligent automation. Partner with academic, administrative, and campus stakeholders to pilot AI-enabled data systems, such as model-assisted data validation and automated feature generation. Stay ahead of emerging trends in data engineering, AI readiness, and cloud infrastructure, continuously recommending and implementing innovative solutions.
  • Other: Contribute to recruitment, hiring, and onboarding of new data engineering team members. Represent the data engineering function in strategic planning discussions and cross-organizational technology initiatives. Perform other duties as assigned, aligned with the mission to build a secure, scalable, and AI-enabled data ecosystem.
  • Bachelor’s or Master’s degree in Computer Science, Information Systems, Engineering, or equivalent professional experience.
  • 5+ years of experience designing, implementing, and maintaining complex, production-grade data pipelines and enterprise data platforms.
  • 5+ years of hands-on experience with cloud-based data engineering, preferably in Amazon Web Services (AWS), with strong command of services such as Glue, S3, Lambda, Redshift, and EMR.
  • 3+ years of experience defining cloud data architecture and data strategy in large, distributed enterprise environments.
  • Deep expertise with Databricks Lakehouse Platform, including Delta Lake, Delta Live Tables, and Unity Catalog, for scalable data ingestion, transformation, and governance.
  • Proficiency in Python, PySpark, and SQL, with demonstrated experience in building ETL/ELT workflows across structured and unstructured data sources.
  • Proven ability to design and implement high-performance, AI-ready data architectures supporting analytics, machine learning, and real-time data processing.
  • Experience developing and deploying Continuous Integration / Continuous Delivery (CI/CD) pipelines for data engineering using tools such as Databricks Repos, GitHub Actions, or Terraform.
  • Strong foundation in test-driven data engineering, including automated data quality, validation, and observability frameworks.
  • Advanced knowledge of data governance, metadata management, and security compliance in cloud and Databricks environments.
  • Excellent systems analysis, design, and troubleshooting skills with the ability to address performance bottlenecks in distributed data systems.
  • Exceptional communication skills, with the ability to convey complex technical concepts clearly to both technical and non-technical stakeholders.
  • Proven experience leading and mentoring teams, fostering technical excellence and innovation.
  • Self-motivated and capable of working independently in a dynamic, evolving technology landscape.
  • Equivalent combination of relevant education and experience may be substituted as appropriate.
  • 10+ years of experience in Data Engineering, Data Architecture, or related fields, including 5+ years of hands-on work with Databricks or equivalent large-scale data platforms.
  • Demonstrated experience architecting and optimizing Lakehouse environments that integrate data science, analytics, and AI workloads.
  • Proven success in implementing AI/ML-ready data pipelines and collaborating with Data Scientists and MLOps teams using tools like MLflow, Feature Store, or model registries.
  • 5+ years of experience applying Agile software development methodologies and using tools such as JIRA, Confluence, or Azure DevOps for project tracking and delivery.
  • Expertise in distributed data processing and streaming technologies such as Apache Spark, Kafka, Flink, or Airflow for orchestration and automation.
  • Experience designing and operationalizing data observability and cost optimization strategies within Databricks and cloud environments.
  • Strong understanding of data mesh, data fabric, and modern metadata management principles for large-scale organizations.
  • Professional certifications such as Databricks Certified Data Engineer Professional, AWS Solutions Architect, or AWS Data Analytics Specialty are highly desirable.
  • Demonstrated ability to drive innovation, introduce emerging technologies, and lead proofs of concept (POCs) for AI integration, automation, or advanced analytics.
  • Commitment to continuous learning and technology leadership, staying current with advancements in Databricks, AI engineering, and modern cloud data ecosystems.
  • Competitive health benefits (Employee premiums covered at 100%; family premiums at 50%)
  • Vision, dental, life, and disability insurance options
  • Paid vacation, sick leave, and holidays
  • Teachers Retirement System of Texas (a defined benefit retirement plan)
  • Additional voluntary retirement programs: a 403(b) tax-sheltered annuity and a 457(b) deferred compensation program
  • Flexible spending account options for medical and childcare expenses
  • Training and conference opportunities
  • Tuition assistance
  • Athletic ticket discounts
  • Access to UT Austin's libraries and museums
  • Free rides on all UT Shuttle and Capital Metro buses with staff ID card