Databricks Architect - Remote

Cognizant Technology Solutions | San Francisco, CA
$88,200 - $139,500 | Remote

About The Position

Cognizant is seeking a Databricks Architect with expertise in building scalable data solutions on AWS; Supply Chain and Consumer Goods / Retail domain knowledge is required. The role combines solution design with hands-on execution to modernize enterprise data platforms from legacy systems to the AWS cloud, and calls for deep experience in data modelling (OLAP and OLTP), data lakes, data warehousing, and ETL/ELT pipelines across both on-premises legacy estates and AWS. The detailed requirements, nice-to-haves, and responsibilities are listed below.
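
To give a concrete flavor of the hands-on work, here is a minimal PySpark sketch of a bronze-layer ingestion job of the kind described above: raw JSON files landed on S3 are appended to a Delta table with ingestion metadata. The bucket, path, and table names are hypothetical placeholders, not part of this posting.

```python
from pyspark.sql import SparkSession, functions as F

# On Databricks the `spark` session is provided by the runtime;
# building one here keeps the sketch self-contained.
spark = SparkSession.builder.appName("bronze_ingest").getOrCreate()

# Hypothetical S3 landing zone and table names -- placeholders only.
RAW_PATH = "s3://example-landing-zone/orders/"
BRONZE_TABLE = "supply_chain.bronze_orders"

# Read the raw files as-is and stamp each record with ingestion
# metadata, the usual contract for a Medallion bronze layer.
bronze_df = (
    spark.read.format("json")
    .load(RAW_PATH)
    .withColumn("_ingested_at", F.current_timestamp())
    .withColumn("_source_file", F.input_file_name())
)

# Append into Delta; cleansing is deferred to downstream silver jobs.
bronze_df.write.format("delta").mode("append").saveAsTable(BRONZE_TABLE)
```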

Requirements

  • Expertise in building scalable data solutions on AWS.
  • Supply Chain and Consumer Goods / Retail domain knowledge.
  • Experience in modernizing enterprise data platforms from legacy to AWS cloud.
  • Deep knowledge and experience in Data Modelling (OLAP and OLTP), Data Lake, Data Warehousing, and ETL/ELT pipelines on both on-premises legacy systems and AWS.
  • Proficient in SQL, Python, PySpark, S3, Lambda.
  • Working knowledge of Git, CI/CD, VS Code.
  • Proficient in the AWS data ingestion stack.
  • Ability to ramp up on Glue, Lambda, Step Functions, Spark Streaming, and other services as needed.
  • Implementation experience with key data concepts such as CDC (Change Data Capture), streaming and/or batch ingestion, pull vs. push paradigms, and source-to-target mapping (see the sketch after this list).
  • Strong semantic layer modelling and implementation experience.
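
As an illustration of the CDC requirement above, a common Databricks pattern on AWS is to discover change records incrementally with Auto Loader and apply them to a Delta target with MERGE. This is a sketch only; the table names, paths, key column, and the `op` change-flag convention are assumptions, not details from this posting.

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("cdc_merge").getOrCreate()  # provided on Databricks

# Hypothetical locations and names -- placeholders only.
CDC_PATH = "s3://example-landing-zone/cdc/customers/"
CHECKPOINT = "s3://example-landing-zone/_checkpoints/customers/"
TARGET_TABLE = "supply_chain.silver_customers"

def apply_cdc(batch_df, batch_id):
    """Upsert one micro-batch of change records into the target table."""
    # A production job would first dedupe to the latest change per key.
    target = DeltaTable.forName(spark, TARGET_TABLE)
    (target.alias("t")
        .merge(batch_df.alias("s"), "t.customer_id = s.customer_id")
        .whenMatchedDelete(condition="s.op = 'D'")
        .whenMatchedUpdateAll(condition="s.op = 'U'")
        .whenNotMatchedInsertAll()
        .execute())

# Auto Loader (`cloudFiles`) incrementally picks up new CDC files on S3.
(spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", CHECKPOINT)
    .load(CDC_PATH)
    .writeStream
    .foreachBatch(apply_cdc)
    .option("checkpointLocation", CHECKPOINT)
    .start())
```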

Nice To Haves

  • Understanding of common reporting stacks such as Power BI, Tableau, Alteryx, and Excel BI Add-Ins.
  • Databricks Certified Data Engineer Associate.

Responsibilities

  • Lead migration and modernization of legacy ETL processes from Informatica, DataStage, and SQL Server to cloud-native solutions.
  • Design and optimize data workflows for ingestion, transformation, and analytics using AWS-native services.
  • Design and build data pipelines and solutions using Databricks (PySpark and SparkSQL) on AWS.
  • Build data estates based on the Medallion architecture.
  • Build a Delta Lake based Lakehouse on Databricks using Delta Live Tables (DLT), PySpark jobs, and Databricks Workflows (see the sketch after this list).
  • Collaborate with cross-functional teams to gather requirements and deliver scalable, secure, and high-performance data solutions.
  • Establish best practices for data governance, lineage, and quality across hybrid environments.
  • Provide technical leadership and mentoring to data engineers and developers.
  • Monitor and troubleshoot performance issues across Databricks and AWS services.
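
For the Lakehouse responsibilities above, here is a minimal Delta Live Tables sketch (referenced from the DLT bullet) showing a bronze-to-silver flow with a data quality expectation. The table names, source path, and columns are hypothetical placeholders.

```python
import dlt  # available only inside a Delta Live Tables pipeline
from pyspark.sql import functions as F

# Hypothetical S3 landing path -- placeholder only.
RAW_PATH = "s3://example-landing-zone/shipments/"

@dlt.table(comment="Raw shipment records landed from S3 (bronze).")
def bronze_shipments():
    # `spark` is injected by the DLT pipeline runtime.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load(RAW_PATH)
    )

@dlt.table(comment="Cleansed, de-duplicated shipments (silver).")
@dlt.expect_or_drop("valid_shipment_id", "shipment_id IS NOT NULL")
def silver_shipments():
    return (
        dlt.read_stream("bronze_shipments")
        .withColumn("shipped_date", F.to_date("shipped_at"))
        .dropDuplicates(["shipment_id"])
    )
```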

Benefits

  • Medical/Dental/Vision/Life Insurance
  • Paid holidays plus Paid Time Off
  • 401(k) plan and contributions
  • Long-term/Short-term Disability
  • Paid Parental Leave
  • Employee Stock Purchase Plan