Full Stack Data Engineer

Ford
Dearborn, MI

About The Position

  • Data Pipeline Architect & Builder: Spearhead the design, development, and maintenance of scalable data ingestion and curation pipelines from diverse sources. Ensure data is standardized, high-quality, and optimized for analytical use. Leverage cutting-edge tools and technologies, including Python, SQL, and DBT/Dataform, to build robust and efficient data pipelines (a pipeline sketch follows this list).
  • End-to-End Integration Expert: Utilize your full-stack skills to contribute to seamless end-to-end development, ensuring smooth and reliable data flow from source to insight.
  • GCP Data Solutions Leader: Leverage your deep expertise in GCP services (BigQuery, Dataflow, Pub/Sub, Cloud Functions, etc.) to build and manage data platforms that not only meet but exceed business needs and expectations.
  • Data Governance & Security Champion: Implement and manage robust data governance policies, access controls, and security best practices, fully utilizing GCP's native security features to protect sensitive data.
  • Data Workflow Orchestrator: Employ Astronomer and Terraform for efficient data workflow management and cloud infrastructure provisioning, championing best practices in Infrastructure as Code (IaC).
  • Performance Optimization Driver: Continuously monitor and improve the performance, scalability, and efficiency of data pipelines and storage solutions, ensuring optimal resource utilization and cost-effectiveness.
  • Collaborative Innovator: Collaborate effectively with data architects, application architects, service owners, and cross-functional teams to define and promote best practices, design patterns, and frameworks for cloud data engineering.
  • Automation & Reliability Advocate: Proactively automate data platform processes to enhance reliability, improve data quality, minimize manual intervention, and drive operational efficiency.
  • Effective Communicator: Clearly and transparently communicate complex technical decisions to both technical and non-technical stakeholders, fostering understanding and alignment.
  • Continuous Learner: Stay ahead of the curve by continuously learning about industry trends and emerging technologies, proactively identifying opportunities to improve our data platform and enhance our capabilities.
  • Business Impact Translator: Translate complex business requirements into optimized data asset designs and efficient code, ensuring that our data solutions directly contribute to business goals.
  • Documentation & Knowledge Sharer: Develop comprehensive documentation for data engineering processes, promoting knowledge sharing, facilitating collaboration, and ensuring long-term system maintainability.

Established and active employee resource groups
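
The ingestion pattern described above typically pairs Pub/Sub, Dataflow, and BigQuery. As an illustration only (the project, topic, table, and schema names are hypothetical), a minimal Apache Beam sketch of such a pipeline might look like this:

    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions


    def standardize(message: bytes) -> dict:
        # Curate the raw event into the standardized schema expected downstream.
        event = json.loads(message.decode("utf-8"))
        return {"event_id": str(event["id"]), "payload": json.dumps(event)}


    # Topic, project, and table names below are placeholders for illustration.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadRawEvents" >> beam.io.ReadFromPubSub(
                topic="projects/my-project/topics/raw-events")
            | "Standardize" >> beam.Map(standardize)
            | "WriteCurated" >> beam.io.WriteToBigQuery(
                "my-project:curated.events",
                schema="event_id:STRING,payload:STRING",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )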

Requirements

  • Bachelor's degree in Computer Science, Information Technology, Information Systems, Data Analytics, or a related field (or equivalent combination of education and experience).
  • 5-7 years of experience in Data Engineering or Software Engineering, with at least 2 years of hands-on experience building and deploying cloud-based data platforms (GCP preferred).
  • Strong proficiency in SQL, Java, and Python, with practical experience in designing and deploying cloud-based data pipelines using GCP services like BigQuery, Dataflow, and DataProc.
  • Solid understanding of Service-Oriented Architecture (SOA) and microservices, and their application within a cloud data platform.
  • Experience with relational databases (e.g., PostgreSQL, MySQL), NoSQL databases, and columnar databases (e.g., BigQuery).
  • Knowledge of data governance frameworks, data encryption, and data masking techniques in cloud environments.
  • Familiarity with CI/CD pipelines (e.g., Tekton), Infrastructure as Code (IaC) tools like Terraform, and other automation frameworks.
  • Excellent analytical and problem-solving skills, with the ability to troubleshoot complex data platform and microservices issues.
  • Experience in monitoring and optimizing cost and compute resources for processes in GCP technologies (e.g., BigQuery, Dataflow, Cloud Run, DataProc); a cost-check sketch follows this list.
  • A passion for data, innovation, and continuous learning.
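
On the cost-monitoring point above, one common lightweight technique is a BigQuery dry run, which estimates bytes scanned before a query ever executes. This is a sketch only; the project, dataset, and query are hypothetical:

    from google.cloud import bigquery

    # Project, dataset, and query below are placeholders for illustration.
    client = bigquery.Client(project="my-project")
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)

    query = """
        SELECT order_id, order_total
        FROM `my-project.curated.orders`
        WHERE order_date >= '2024-01-01'
    """
    job = client.query(query, job_config=job_config)

    # total_bytes_processed is populated even for dry runs, so it can be used to
    # flag expensive queries in review or CI before they spend real compute.
    print(f"Estimated scan: {job.total_bytes_processed / 1e9:.2f} GB")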

Responsibilities

  • Design, develop, and maintain scalable data ingestion and curation pipelines from diverse sources.
  • Ensure data is standardized, high-quality, and optimized for analytical use.
  • Leverage Python, SQL, and DBT/Dataform to build robust and efficient data pipelines.
  • Contribute to seamless end-to-end development, ensuring smooth and reliable data flow from source to insight.
  • Build and manage data platforms using GCP services (BigQuery, Dataflow, Pub/Sub, Cloud Functions, etc.).
  • Implement and manage robust data governance policies, access controls, and security best practices.
  • Employ Astronomer and Terraform for efficient data workflow management and cloud infrastructure provisioning (see the DAG sketch after this list).
  • Monitor and improve the performance, scalability, and efficiency of data pipelines and storage solutions.
  • Collaborate with data architects, application architects, service owners, and cross-functional teams.
  • Automate data platform processes to enhance reliability, improve data quality, minimize manual intervention, and drive operational efficiency.
  • Communicate complex technical decisions to both technical and non-technical stakeholders.
  • Continuously learn about industry trends and emerging technologies.
  • Translate complex business requirements into optimized data asset designs and efficient code.
  • Develop comprehensive documentation for data engineering processes.
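
For the orchestration responsibility above, Astronomer runs Apache Airflow, so a daily curation step might be expressed as a DAG along these lines. This is a sketch of the Airflow side only (Terraform provisioning is not shown), and the DAG id, schedule, and BigQuery identifiers are hypothetical:

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.bigquery import (
        BigQueryInsertJobOperator,
    )

    # DAG id, schedule, and BigQuery identifiers below are placeholders.
    with DAG(
        dag_id="curate_orders_daily",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        curate_orders = BigQueryInsertJobOperator(
            task_id="curate_orders",
            configuration={
                "query": {
                    "query": (
                        "SELECT * FROM `my-project.raw.orders` "
                        "WHERE order_date = '{{ ds }}'"
                    ),
                    "destinationTable": {
                        "projectId": "my-project",
                        "datasetId": "curated",
                        "tableId": "orders",
                    },
                    "writeDisposition": "WRITE_APPEND",
                    "useLegacySql": False,
                }
            },
        )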