Enterprise Data Operations Associate Manager

PepsiCo, Plano, TX
Hybrid

About The Position

Job Title: Enterprise Data Operations Associate Manager
Employer: PepsiCo, Inc.
Location: 7701 Legacy Drive, Plano, TX 75024
Telecommuting permitted 60%: work may be performed within normal commuting distance from the PepsiCo office in Plano, TX.

Requirements

  • Position requires a Bachelor's degree (US or Foreign Equivalent) in Computer Science, Computer Engineering, Management Information Systems, or a related field and six (6) years of experience in job offered or related role.
  • Must have five (5) years of experience with: hands-on software development, data engineering, and systems architecture.
  • Must have four (4) years of experience with: data lake infrastructure, data warehousing, and data analytics tools; cloud data engineering in at least one cloud (Azure, AWS, or GCP); integration of multi-cloud services with on-premises technologies; data modeling, data warehousing, and building high-volume ETL/ELT pipelines; building and operating highly available, distributed systems for data extraction, ingestion, and processing of large data sets; at least one MPP database technology such as Redshift, Synapse, or Snowflake; and the version control systems GitHub and Azure DevOps.
  • Must have two (2) years of experience with: developing enterprise data models; data profiling; and at least one data quality tool such as Apache Griffin, Deequ, or Great Expectations.

Responsibilities

  • Review and contribute to code development while building and maintaining scalable data pipelines from internal and external sources to support product launches and ensure data quality.
  • Develop automation and monitoring frameworks to capture pipeline performance metrics and KPIs.
  • Implement best practices for systems integration, security, performance, and data management.
  • Collaborate with data science and product teams on solutioning and POCs, and drive business value through increased adoption of data, data science, and business intelligence solutions.
  • Work in a hybrid environment with in-house, on-premises data sources as well as cloud and remote systems.
  • Design scalable patterns and architecture to support both batch and real-time data products and platforms using big data technologies such as Hadoop, SQL Data Warehouse, EMR, Spark, Databricks, Snowflake, Azure Synapse, or other cloud data warehousing technologies.
  • Ensure physical and logical data models are designed with an extensible philosophy to support future, unknown use cases with minimal rework.
  • Work with product managers and data stewards within the enterprise data governance process to define and conceptualize data models across enterprise master data, transaction data, and informational data and implement those models into the enterprise data model.
  • Partner with the data science team to standardize their classification of unstructured data into standard structures for data discovery and action by business customers and stakeholders.
  • Help with intake prioritization and decisions about what to pursue across a wide base of users and stakeholders and across products, databases, and services.
  • Manage and scale data pipelines from internal and external data sources to support new product launches and drive data quality across data products.
  • Build and own the automation and monitoring frameworks that capture metrics and operational KPIs for data pipeline quality and performance.
  • Empower the business by creating value through increased adoption of data, data science, and business intelligence solutions, and collaborate with internal clients to drive solutioning and POC discussions.