Senior Data Engineer

NVIDIA · Santa Clara, CA

About The Position

The NVIDIA Operations organization is seeking an experienced technology professional for the position of Senior Data Engineer to support initiatives for Operations. As a Data Engineer on our team, you will be an integral part of the Data and Advanced Analytics organization in Operations, which is building the Operations Data Platform to turn data into business insights and results.

Requirements

  • Bachelor's degree in Computer Science or Information Systems, or equivalent experience
  • 8+ years of relevant experience
  • Strong proficiency in Python, PySpark, and SQL
  • Experience with Terraform or similar IaC tools for managing AWS and Databricks resources
  • Solid understanding of software engineering principles: CI/CD, unit testing, code reviews
  • Experience architecting, designing, and maintaining data warehouses/data lakes for complex data ecosystems
  • Expert in data pipeline development including replication, streaming, API integration, and batch processing
  • Experience with Databricks and AWS (or similar cloud platforms)
  • Highly independent, able to lead key technical decisions and influence project roadmap
  • Strong analytical skills with attention to detail and accuracy
  • Knowledge of supply chain business processes for planning, procurement, and logistics
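The pipeline-development requirement above calls out batch processing among other patterns. As a minimal, hedged sketch (stdlib only, no PySpark; the function name and batch size are illustrative, not from the posting), the core batching idea is to process source records in fixed-size chunks so a failure only requires re-running one batch:

```python
# Illustrative batch-processing sketch: split an input stream into
# fixed-size chunks. `batched` is a hypothetical helper, not an API
# named in the posting.
from itertools import islice
from typing import Iterable, Iterator, List

def batched(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

rows = [{"id": i} for i in range(7)]
batches = list(batched(rows, 3))
# 7 rows at batch size 3 -> batches of 3, 3, and 1
```

In a real pipeline each chunk would be written to the lake with an idempotent key so a retried batch overwrites rather than duplicates.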

Nice To Haves

  • Experience with contract manufacturing (CM) or ODM-related datasets
  • Master's degree in Computer Science or Information Systems, or equivalent experience
  • Experience with Airflow for file-level orchestration (download, upload, encryption, compression)
  • Familiarity with ODBC-based data ingestion and data modeling
  • Knowledge of operational processes in chips, boards, systems, and servers
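The Airflow bullet above describes file-level orchestration steps such as download, encryption, and compression. One such step can be sketched with the standard library alone (DAG wiring omitted; `compress_and_checksum` is a hypothetical task body, not code from the posting):

```python
# Hedged sketch of a file-level orchestration step: gzip-compress a
# file and record its SHA-256 checksum for downstream verification.
import gzip
import hashlib
import shutil
import tempfile
from pathlib import Path

def compress_and_checksum(src: Path) -> tuple[Path, str]:
    """Compress `src` to `src`.gz and return the output path and checksum."""
    dst = src.with_suffix(src.suffix + ".gz")
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    digest = hashlib.sha256(dst.read_bytes()).hexdigest()
    return dst, digest

# Demo on a throwaway file
tmp = Path(tempfile.mkdtemp()) / "export.csv"
tmp.write_text("id,qty\n1,5\n")
out, sha = compress_and_checksum(tmp)
```

In an Airflow DAG this would typically be one task, with the checksum passed to the next task (e.g. upload) to verify the transfer.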

Responsibilities

  • Develop and implement end-to-end data systems for Planning, Logistics and Services, and Sourcing initiatives
  • Design, deploy, and maintain AWS infrastructure (VPCs, IAM roles, S3, Lambda, Step Functions, Secrets Manager) following security guidelines
  • Build and maintain data pipelines using PySpark and Python to transport data from source systems to the data lake
  • Manage and deploy cloud infrastructure using Terraform and Infrastructure-as-Code (IaC) principles
  • Architect and administer Databricks workspaces, including Unity Catalog, cluster policies, and workspace configurations
  • Implement security controls and access management across AWS and Databricks environments (IAM, RBAC, network security, encryption)
  • Automate infrastructure provisioning and configuration using Terraform with proper state management and modular design
  • Troubleshoot and resolve infrastructure issues across cloud environments, ensuring high availability and performance
  • Collaborate with security teams on compliance requirements, audit logging, and data governance policies
  • Lead discussions with collaborators and IT to identify and implement the right data strategy given data sources, data locations, and use cases
  • Provide operational support for ETL pipelines handling manufacturing test data and contract manufacturer (CM) datasets
  • Analyze and organize raw operational data, both structured and unstructured
  • Implement CI/CD pipelines with proper unit testing and software engineering guidelines
  • Support scalable multi-functional data lake solutions
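The CI/CD and unit-testing responsibility above usually implies keeping pipeline transformations as pure functions so CI can assert on them without spinning up a cluster. A minimal sketch of that guideline (`normalize_record` is a hypothetical transform, not one named in the posting):

```python
# Sketch of a unit-testable pipeline transform: a pure function that
# cleans one raw record. Hypothetical example; field names are made up.
from datetime import datetime, timezone

def normalize_record(raw: dict) -> dict:
    """Trim string fields and stamp the ingestion time (UTC, ISO-8601)."""
    clean = {k: v.strip() if isinstance(v, str) else v for k, v in raw.items()}
    clean["ingested_at"] = datetime.now(timezone.utc).isoformat()
    return clean

rec = normalize_record({"part": " GPU-01 ", "qty": 2})
```

Because the function touches no cluster or cloud resource, a CI job can exercise it with plain assertions on inputs and outputs.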

Benefits

  • Highly competitive salaries
  • Comprehensive benefits package
  • Equity