About The Position

  • Data Engineering Leadership: Participate in the design and development of data pipelines that ingest, transform, and load data from various sources (databases, APIs, streaming platforms) into our data warehouse/lake, ensuring seamless data flow and accessibility. Develop data models that support business requirements and analytical needs, and optimize them for query performance and accessibility.
  • Database Optimization: Write optimized, maintainable SQL queries and leverage SQLAlchemy for efficient database interaction, ensuring high performance and data accuracy (a minimal sketch follows this list).
  • Data Quality Assurance: Implement robust data quality checks and monitoring systems to ensure data integrity and accuracy, proactively identifying and resolving data issues (a sketch appears after the Responsibilities section below).
  • Data Governance Contribution: Contribute to the design and implementation of data governance policies and procedures, ensuring compliance with regulatory requirements and internal standards.
  • Technology Innovation: Continuously research and adopt new technologies and best practices to improve the efficiency, scalability, and resilience of our data platform.
  • Cloud Deployment & Monitoring: Own the deployment and monitoring of data pipelines and related infrastructure on cloud platforms such as OpenShift, ECS, or Kubernetes, ensuring optimal performance and reliability.
  • Operational Excellence: Occasionally work a non-standard shift, including nights and/or weekends, and/or carry on-call responsibilities to support critical data operations.
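
To give a concrete flavor of the Database Optimization work above, here is a minimal SQLAlchemy sketch. It is illustrative only: the connection URL, the events table, and its columns are hypothetical, and it assumes SQLAlchemy 1.4+ against a PostgreSQL warehouse.

```python
# Minimal sketch: parameterized, streaming query via SQLAlchemy.
# Connection URL, table, and column names are hypothetical.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg2://user:secret@host/warehouse")

def load_recent_events(days=7):
    # Bound parameters keep query plans cacheable and prevent SQL
    # injection; stream_results avoids pulling all rows into memory.
    query = text("""
        SELECT event_id, event_ts, payload
        FROM events
        WHERE event_ts >= now() - (:days * interval '1 day')
    """)
    with engine.connect() as conn:
        result = conn.execution_options(stream_results=True).execute(
            query, {"days": days}
        )
        for row in result:
            yield row
```

Streaming the result keeps memory flat on large extracts, which is exactly the kind of performance concern this role calls out.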

Requirements

  • 6+ years of hands-on experience in a Data Engineering role.
  • Strong proficiency in Python (version 3.6+), with experience in Python packaging and widely used data libraries such as Pandas and NumPy.
  • Experience implementing REST APIs in Python using microframeworks like Flask (see the sketch after this list).
  • Extensive experience working with both relational and NoSQL databases.
  • Hands-on skills in writing complex SQL and optimizing queries for performance.
  • Solid understanding of data warehousing concepts and experience working with large datasets, including data modeling and ETL processes.
  • Experience working in a Continuous Integration and Continuous Delivery environment and familiarity with tools like Jenkins, TeamCity, SonarQube, OpenShift, ECS, or Kubernetes.
  • Proficiency in industry-standard best practices such as design patterns, coding standards, code modularity, and prototyping.
  • Strong communication skills, both written and verbal, with the ability to explain complex technical concepts to both technical and non-technical audiences.
  • Bachelor's degree in Computer Science, Software Engineering, or a related field.
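
For the Flask requirement above, the sketch below shows a bare-bones REST endpoint; the /pipelines resource and its in-memory store are hypothetical stand-ins for a real metadata service.

```python
# Minimal sketch of a Flask REST microservice.
# The resource name and in-memory store are hypothetical.
from flask import Flask, jsonify, request

app = Flask(__name__)
_pipelines = {}  # stand-in for a real metadata store

@app.route("/pipelines/<name>", methods=["GET"])
def get_pipeline(name):
    pipeline = _pipelines.get(name)
    if pipeline is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(pipeline)

@app.route("/pipelines/<name>", methods=["PUT"])
def upsert_pipeline(name):
    # Store the JSON body as the pipeline definition (upsert semantics).
    _pipelines[name] = request.get_json(force=True)
    return jsonify(_pipelines[name]), 200

if __name__ == "__main__":
    app.run(port=8080)
```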

Nice To Haves

  • Experience with data visualization tools and techniques for presenting data insights effectively.
  • Familiarity with agile development methodologies and experience working in agile teams.
  • Experience with workflow management tools like Airflow; experience with PySpark or PyFlink is a major plus (a minimal Airflow sketch follows this list).
  • Leadership & Mentorship: Ability to guide and mentor junior developers, fostering a collaborative team environment and promoting professional growth.
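
For the Airflow item above, here is a minimal DAG sketch wiring extract, transform, and load steps; the DAG id, schedule, and task bodies are hypothetical placeholders, and the schedule argument assumes Airflow 2.4+.

```python
# Minimal sketch of an Airflow DAG: extract -> transform -> load.
# DAG id, schedule, and task bodies are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**_):
    print("pull rows from the source system")

def transform(**_):
    print("clean and reshape the extracted rows")

def load(**_):
    print("write the transformed rows to the warehouse")

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> transform_task >> load_task
```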

Responsibilities

  • Design, develop, and maintain database schemas and models.
  • Write and optimize SQL queries for data retrieval, manipulation, and reporting.
  • Communicate technical concepts and solutions effectively to both technical and non-technical audiences.
  • Provide technical support and troubleshooting for production systems.
  • Stay up-to-date with the latest trends and technologies in Python development, database systems, and data engineering.
  • Evaluate and recommend new tools and technologies to improve development efficiency and product quality.
  • Contribute to the continuous improvement of development processes and practices.
  • Guide and mentor junior developers.
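
Finally, to illustrate the data quality checks called out in the Data Quality Assurance duty above, here is a minimal pandas sketch; the column names and the 1% null-rate threshold are hypothetical.

```python
# Minimal sketch of row-level data quality checks with pandas.
# Column names and thresholds are hypothetical examples.
from typing import List

import pandas as pd

def check_quality(df: pd.DataFrame) -> List[str]:
    """Return a list of human-readable data quality issues."""
    issues = []
    if df["event_id"].duplicated().any():
        issues.append("duplicate event_id values")
    null_rate = df["event_ts"].isna().mean()
    if null_rate > 0.01:  # tolerate at most 1% missing timestamps
        issues.append("event_ts null rate too high: {:.2%}".format(null_rate))
    if (df["amount"] < 0).any():
        issues.append("negative amounts found")
    return issues

if __name__ == "__main__":
    frame = pd.DataFrame({
        "event_id": [1, 2, 2],
        "event_ts": ["2024-01-01", None, "2024-01-02"],
        "amount": [10.0, -5.0, 3.0],
    })
    for issue in check_quality(frame):
        print("quality issue:", issue)
```

Checks like these would typically run as a gate inside the pipeline, failing the run (or alerting) before bad rows reach the warehouse.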