Full Stack Data Engineer

Ford
Dearborn, MI

About The Position

  • Data Pipeline Architect & Builder: Spearhead the design, development, and maintenance of scalable data ingestion and curation pipelines from diverse sources. Ensure data is standardized, high-quality, and optimized for analytical use. Leverage cutting-edge tools and technologies, including Python, SQL, and DBT/Dataform, to build robust and efficient data pipelines (a pipeline sketch follows this list).
  • End-to-End Integration Expert: Utilize your full-stack skills to contribute to seamless end-to-end development, ensuring smooth and reliable data flow from source to insight.
  • GCP Data Solutions Leader: Leverage your deep expertise in GCP services (BigQuery, Dataflow, Pub/Sub, Cloud Functions, etc.) to build and manage data platforms that not only meet but exceed business needs and expectations.
  • Data Governance & Security Champion: Implement and manage robust data governance policies, access controls, and security best practices, fully utilizing GCP's native security features to protect sensitive data.
  • Data Workflow Orchestrator: Employ Astronomer and Terraform for efficient data workflow management and cloud infrastructure provisioning, championing best practices in Infrastructure as Code (IaC).
  • Performance Optimization Driver: Continuously monitor and improve the performance, scalability, and efficiency of data pipelines and storage solutions, ensuring optimal resource utilization and cost-effectiveness.
  • Collaborative Innovator: Collaborate effectively with data architects, application architects, service owners, and cross-functional teams to define and promote best practices, design patterns, and frameworks for cloud data engineering.
  • Automation & Reliability Advocate: Proactively automate data platform processes to enhance reliability, improve data quality, minimize manual intervention, and drive operational efficiency.
  • Effective Communicator: Clearly and transparently communicate complex technical decisions to both technical and non-technical stakeholders, fostering understanding and alignment.
  • Continuous Learner: Stay ahead of the curve by continuously learning about industry trends and emerging technologies, proactively identifying opportunities to improve our data platform and enhance our capabilities.
  • Business Impact Translator: Translate complex business requirements into optimized data asset designs and efficient code, ensuring that our data solutions directly contribute to business goals.
  • Documentation & Knowledge Sharer: Develop comprehensive documentation for data engineering processes, promoting knowledge sharing, facilitating collaboration, and ensuring long-term system maintainability.

Established and active employee resource groups
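
The ingestion pattern described above typically pairs Pub/Sub, Dataflow, and BigQuery. As an illustration only (the project, topic, table, and schema names are hypothetical), a minimal Apache Beam sketch of such a pipeline might look like this:

    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions


    def standardize(message: bytes) -> dict:
        # Curate the raw event into the standardized schema expected downstream.
        event = json.loads(message.decode("utf-8"))
        return {"event_id": str(event["id"]), "payload": json.dumps(event)}


    # Topic, project, and table names below are placeholders for illustration.
    options = PipelineOptions(streaming=True)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadRawEvents" >> beam.io.ReadFromPubSub(
                topic="projects/my-project/topics/raw-events")
            | "Standardize" >> beam.Map(standardize)
            | "WriteCurated" >> beam.io.WriteToBigQuery(
                "my-project:curated.events",
                schema="event_id:STRING,payload:STRING",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )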

Requirements

  • Bachelor's degree in Computer Science, Information Technology, Information Systems, Data Analytics, or a related field (or equivalent combination of education and experience).
  • 5-7 years of experience in Data Engineering or Software Engineering, with at least 2 years of hands-on experience building and deploying cloud-based data platforms (GCP preferred).
  • Strong proficiency in SQL, Java, and Python, with practical experience in designing and deploying cloud-based data pipelines using GCP services like BigQuery, Dataflow, and DataProc.
  • Solid understanding of Service-Oriented Architecture (SOA) and microservices, and their application within a cloud data platform.
  • Experience with relational databases (e.g., PostgreSQL, MySQL), NoSQL databases, and columnar databases (e.g., BigQuery).
  • Knowledge of data governance frameworks, data encryption, and data masking techniques in cloud environments.
  • Familiarity with CI/CD pipelines (e.g., Tekton), Infrastructure as Code (IaC) tools like Terraform, and other automation frameworks.
  • Excellent analytical and problem-solving skills, with the ability to troubleshoot complex data platform and microservices issues.
  • Experience in monitoring and optimizing cost and compute resources for processes in GCP technologies (e.g., BigQuery, Dataflow, Cloud Run, DataProc); a cost-check sketch follows this list.
  • A passion for data, innovation, and continuous learning.
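
On the cost-monitoring point above, one common lightweight technique is a BigQuery dry run, which estimates bytes scanned before a query ever executes. This is a sketch only; the project, dataset, and query are hypothetical:

    from google.cloud import bigquery

    # Project, dataset, and query below are placeholders for illustration.
    client = bigquery.Client(project="my-project")
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)

    query = """
        SELECT order_id, order_total
        FROM `my-project.curated.orders`
        WHERE order_date >= '2024-01-01'
    """
    job = client.query(query, job_config=job_config)

    # total_bytes_processed is populated even for dry runs, so it can be used to
    # flag expensive queries in review or CI before they spend real compute.
    print(f"Estimated scan: {job.total_bytes_processed / 1e9:.2f} GB")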

Responsibilities

  • Design, develop, and maintain scalable data ingestion and curation pipelines from diverse sources.
  • Ensure data is standardized, high-quality, and optimized for analytical use.
  • Leverage Python, SQL, and DBT/Dataform to build robust and efficient data pipelines.
  • Contribute to seamless end-to-end development, ensuring smooth and reliable data flow from source to insight.
  • Build and manage data platforms using GCP services (BigQuery, Dataflow, Pub/Sub, Cloud Functions, etc.).
  • Implement and manage robust data governance policies, access controls, and security best practices.
  • Employ Astronomer and Terraform for efficient data workflow management and cloud infrastructure provisioning (see the DAG sketch after this list).
  • Monitor and improve the performance, scalability, and efficiency of data pipelines and storage solutions.
  • Collaborate with data architects, application architects, service owners, and cross-functional teams.
  • Automate data platform processes to enhance reliability, improve data quality, minimize manual intervention, and drive operational efficiency.
  • Communicate complex technical decisions to both technical and non-technical stakeholders.
  • Continuously learn about industry trends and emerging technologies.
  • Translate complex business requirements into optimized data asset designs and efficient code.
  • Develop comprehensive documentation for data engineering processes.
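
For the orchestration responsibility above, Astronomer runs Apache Airflow, so a daily curation step might be expressed as a DAG along these lines. This is a sketch of the Airflow side only (Terraform provisioning is not shown), and the DAG id, schedule, and BigQuery identifiers are hypothetical:

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.bigquery import (
        BigQueryInsertJobOperator,
    )

    # DAG id, schedule, and BigQuery identifiers below are placeholders.
    with DAG(
        dag_id="curate_orders_daily",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        curate_orders = BigQueryInsertJobOperator(
            task_id="curate_orders",
            configuration={
                "query": {
                    "query": (
                        "SELECT * FROM `my-project.raw.orders` "
                        "WHERE order_date = '{{ ds }}'"
                    ),
                    "destinationTable": {
                        "projectId": "my-project",
                        "datasetId": "curated",
                        "tableId": "orders",
                    },
                    "writeDisposition": "WRITE_APPEND",
                    "useLegacySql": False,
                }
            },
        )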