Data Pipeline Architect & Builder

Stefanini Group · Dearborn, MI · Onsite

About The Position

Stefanini Group is hiring! Stefanini is looking for a Data Pipeline Architect & Builder in Dearborn, MI (onsite). For quick apply, please reach out to Fardeen Ali at 248-582-6473 / [email protected].

We are looking for a Data Pipeline Architect & Builder who will spearhead the design, development, and maintenance of scalable data ingestion and curation pipelines from diverse sources. In this role you will act as a GCP data solutions leader, a data governance and security champion, a data workflow orchestrator, and a continuous learner, collaborating with architects, service owners, and cross-functional teams to deliver a reliable, cost-effective data platform. The full set of duties is listed under Responsibilities below.

Requirements

  • Expertise in NoSQL, PostgreSQL, GCP, Python.
  • 5-7 years of experience in Data Engineering or Software Engineering.
  • Strong proficiency in SQL, Java, and Python, with practical experience designing and deploying cloud-based data pipelines using GCP services such as BigQuery, Dataflow, and Dataproc; a pipeline sketch follows this list.
  • Solid understanding of Service-Oriented Architecture (SOA) and microservices, and their application within a cloud data platform.
  • Experience with relational databases (e.g., PostgreSQL, MySQL), NoSQL databases, and columnar databases (e.g., BigQuery).
  • Knowledge of data governance frameworks, data encryption, and data masking techniques in cloud environments.
  • Familiarity with CI/CD pipelines (e.g., Tekton), Infrastructure as Code (IaC) tools such as Terraform, and other automation frameworks.
  • Excellent analytical and problem-solving skills, with the ability to troubleshoot complex data platform and microservices issues.
  • Experience in monitoring and optimizing cost and compute resources for processes in GCP technologies (e.g., BigQuery, Dataflow, Cloud Run, Dataproc); a query cost-estimation sketch also follows this list.
  • At least 2 years of hands-on experience building and deploying cloud-based data platforms (GCP preferred).
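
To make the Dataflow requirement concrete, here is a minimal Apache Beam pipeline sketch of the kind of ingestion work described above. All names (my-project, gs://my-bucket, the analytics.events table) are hypothetical; this is an illustration under those assumptions, not a prescribed implementation.

```python
# Minimal Beam pipeline: read JSON lines from GCS, parse, append to BigQuery.
# Hypothetical project, bucket, and table names throughout.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_event(line: str) -> dict:
    """Parse one JSON line into a row dict matching the destination schema."""
    record = json.loads(line)
    return {"user_id": record["user_id"], "event": record["event"], "ts": record["ts"]}


options = PipelineOptions(
    runner="DataflowRunner",          # swap for "DirectRunner" to test locally
    project="my-project",             # hypothetical project id
    temp_location="gs://my-bucket/tmp",
    region="us-central1",
)

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read raw events" >> beam.io.ReadFromText("gs://my-bucket/raw/events-*.json")
        | "Parse JSON" >> beam.Map(parse_event)
        | "Write to BigQuery" >> beam.io.WriteToBigQuery(
            "my-project:analytics.events",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,  # table pre-exists
        )
    )
```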
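For the cost-monitoring requirement, one common pattern is a BigQuery dry run, which validates and prices a query without executing it. A minimal sketch, assuming a hypothetical project and table and an illustrative 100 GiB guardrail:

```python
# Pre-flight cost check via a BigQuery dry run; nothing is executed or billed.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project id

sql = "SELECT user_id, COUNT(*) AS n FROM `my-project.analytics.events` GROUP BY user_id"

job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(sql, job_config=job_config)  # returns pricing metadata only

scanned_gib = job.total_bytes_processed / 1024**3
print(f"Query would scan {scanned_gib:.2f} GiB")

# Example guardrail: refuse to run anything that scans more than 100 GiB.
assert scanned_gib < 100, "Query too expensive; add partition or cluster filters"
```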

Responsibilities

  • Spearhead the design, development, and maintenance of scalable data ingestion and curation pipelines from diverse sources.
  • Ensure data is standardized, high-quality, and optimized for analytical use.
  • Leverage cutting-edge tools and technologies, including Python, SQL, and dbt/Dataform, to build robust and efficient data pipelines.
  • Apply your full-stack integration skills to seamless end-to-end development, ensuring smooth and reliable data flow from source to insight.
  • Leverage your deep expertise in GCP services (BigQuery, Dataflow, Pub/Sub, Cloud Functions, etc.) to build and manage data platforms that not only meet but exceed business needs and expectations; a Pub/Sub ingestion sketch follows this list.
  • Implement and manage robust data governance policies, access controls, and security best practices, fully utilizing GCP's native security features to protect sensitive data.
  • Employ Astronomer and Terraform for efficient data workflow management and cloud infrastructure provisioning, championing best practices in Infrastructure as Code (IaC); a minimal Airflow DAG sketch also follows this list.
  • Continuously monitor and improve the performance, scalability, and efficiency of data pipelines and storage solutions, ensuring optimal resource utilization and cost-effectiveness.
  • Collaborate effectively with data architects, application architects, service owners, and cross-functional teams to define and promote best practices, design patterns, and frameworks for cloud data engineering.
  • Proactively automate data platform processes to enhance reliability, improve data quality, minimize manual intervention, and drive operational efficiency.
  • Clearly and transparently communicate complex technical decisions to both technical and non-technical stakeholders, fostering understanding and alignment.
  • Stay ahead of the curve by continuously learning about industry trends and emerging technologies, proactively identifying opportunities to improve our data platform and enhance our capabilities.
  • Translate complex business requirements into optimized data asset designs and efficient code, ensuring that our data solutions directly contribute to business goals.
  • Develop comprehensive documentation for data engineering processes, promoting knowledge sharing, facilitating collaboration, and ensuring long-term system maintainability.
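
As a concrete illustration of the Pub/Sub and Cloud Functions work above, here is a minimal sketch of a Pub/Sub-triggered Cloud Function (Python, using the Functions Framework) that appends decoded messages to BigQuery. The topic wiring, project, and table names are hypothetical.

```python
# Event-driven ingestion step: decode one Pub/Sub message, stream it into BigQuery.
# Destination table name is hypothetical; the topic trigger is configured at deploy time.
import base64
import json

import functions_framework
from google.cloud import bigquery

client = bigquery.Client()
TABLE = "my-project.analytics.raw_events"  # hypothetical destination table


@functions_framework.cloud_event
def ingest_event(cloud_event):
    """Handle one Pub/Sub-delivered CloudEvent and append it as a row."""
    payload = base64.b64decode(cloud_event.data["message"]["data"]).decode("utf-8")
    row = json.loads(payload)
    errors = client.insert_rows_json(TABLE, [row])  # streaming insert
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")
```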
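For the workflow-orchestration duty: Astronomer runs standard Apache Airflow, so a DAG along these lines would be deployable there. A minimal sketch with placeholder extract/load tasks; the schedule and all names are illustrative.

```python
# Minimal daily Airflow DAG using the TaskFlow API; task bodies are placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["ingestion"])
def daily_ingestion():
    @task
    def extract() -> list[dict]:
        # Placeholder: pull raw records from a source system.
        return [{"user_id": 1, "event": "login"}]

    @task
    def load(rows: list[dict]) -> None:
        # Placeholder: write curated rows to the warehouse.
        print(f"Loading {len(rows)} rows")

    load(extract())


daily_ingestion()
```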