We are seeking an early-career Business Systems Engineer with a strong foundation in data pipeline management, supported by hands-on experience with dbt Core, SQL, and Databricks on AWS. This role is intended for candidates who already understand the fundamentals of building, deploying, and supporting Data Universe systems and want to apply those skills in a production data platform. You will take hands-on ownership of data engineering workflows end-to-end, from preparing high-quality data to supporting the analytics that enable business decision-making.

Data Pipeline Management
- Perform regular data validation and cleansing to ensure the accuracy, integrity, and reliability of datasets
- Identify and resolve data pipeline failures, debugging data anomalies and issues using SQL and dbt test results
- Build and maintain ETL/ELT processes that move data from various sources into data warehouses and lakes
- Write and optimize SQL transformations that support feature engineering and model training
- Set up the data catalog, and execute and monitor data and ML workloads using Databricks
- Onboard data product owners to the Data Universe platform
- Support AWS-based lakehouse architectures, primarily using Amazon S3
- Set up IAM (Identity and Access Management) roles, permissions, and secure access patterns
- Troubleshoot and optimize cloud-based AI and data workflows
- Support batch and micro-batch processing using Spark
- Manage data governance, access control, and discovery using Databricks Unity Catalog

Enable AI-Ready Data Models
- Design and maintain high-performance Delta Lake pipelines using the Medallion Architecture (Bronze, Silver, and Gold)
- Apply dbt tests and documentation to ensure data quality for AI consumption
- Architect curated datasets that stay strictly aligned with upstream raw sources, ensuring a seamless and transparent flow of information from ingestion to consumption
- Execute code reviews and follow established dbt and SQL standards
- Build and maintain training and inference workflows on Databricks
- Prepare and validate feature datasets used by ML models, ensuring correctness, consistency, and timeliness
- Support LLM-enabled use cases such as embedding generation, semantic search, and retrieval-augmented generation (RAG)
- Monitor model inputs and outputs for data quality issues and unexpected behavior
- Understand how upstream data changes affect model performance, stability, and bias
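To give a flavor of the data-quality work above, here is a minimal sketch, in plain Python, of the kinds of checks dbt tests express (not_null, unique, accepted_values). The table and column names ("order_id", "status") are hypothetical examples, not part of this role's actual schema.

```python
# Hypothetical dbt-style data-quality checks, sketched in pure Python.
# Each function returns the *failures*, mirroring how dbt tests surface rows.

def not_null(rows, column):
    """Rows where `column` is missing a value."""
    return [r for r in rows if r.get(column) is None]

def unique(rows, column):
    """Values in `column` that appear more than once."""
    seen, dupes = set(), set()
    for r in rows:
        v = r.get(column)
        if v in seen:
            dupes.add(v)
        seen.add(v)
    return sorted(dupes)

def accepted_values(rows, column, allowed):
    """Rows whose `column` value falls outside the allowed set."""
    return [r for r in rows if r.get(column) not in allowed]

# Illustrative records with deliberate problems.
orders = [
    {"order_id": 1, "status": "shipped"},
    {"order_id": 1, "status": "pending"},   # duplicate key
    {"order_id": 2, "status": None},        # null status
]

print(not_null(orders, "status"))                          # one failing row
print(unique(orders, "order_id"))                          # [1]
print(accepted_values(orders, "status", {"shipped", "pending"}))
```

In the actual role these checks would live as declarative tests in a dbt project rather than hand-written functions; the sketch only shows the logic they encode.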
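The Medallion Architecture mentioned above can be sketched on plain Python records: raw Bronze data is cleansed and typed into Silver, then aggregated into a curated Gold view. In practice these layers would be Delta Lake tables transformed with Spark or dbt on Databricks; the field names here are illustrative assumptions.

```python
# Hedged sketch of a Bronze -> Silver -> Gold flow on in-memory records.

bronze = [  # raw ingested rows, kept as-landed (strings, mixed casing)
    {"id": "1", "amount": "10.50", "country": "us"},
    {"id": "2", "amount": "bad",   "country": "US"},  # unparseable amount
    {"id": "3", "amount": "4.00",  "country": "us"},
]

def to_silver(records):
    """Cleanse and type-cast; drop rows that fail parsing."""
    out = []
    for r in records:
        try:
            out.append({
                "id": int(r["id"]),
                "amount": float(r["amount"]),
                "country": r["country"].upper(),
            })
        except ValueError:
            continue  # a real pipeline would quarantine or flag these rows
    return out

def to_gold(records):
    """Aggregate curated data for consumption: revenue per country."""
    totals = {}
    for r in records:
        totals[r["country"]] = totals.get(r["country"], 0.0) + r["amount"]
    return totals

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # -> {'US': 14.5}
```

The point of the layering is that the Gold view stays traceable back to the raw Bronze rows it came from, which is what "strict alignment with upstream raw sources" refers to.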
Job Type
Full-time
Career Level
Entry Level