About The Position

  • Data Engineering Leadership: Participate in the design and development of data pipelines that ingest, transform, and load data from various sources (databases, APIs, streaming platforms) into our data warehouse/lake, ensuring seamless data flow and accessibility. Develop data models that support business requirements and analytical needs, and optimize them for query performance and accessibility.
  • Database Optimization: Write optimized, maintainable SQL queries and leverage SQLAlchemy for efficient database interaction, ensuring high performance and data accuracy (a minimal sketch follows this list).
  • Data Quality Assurance: Implement robust data quality checks and monitoring systems to ensure data integrity and accuracy, proactively identifying and resolving data issues (a sketch appears after the Responsibilities section below).
  • Data Governance Contribution: Contribute to the design and implementation of data governance policies and procedures, ensuring compliance with regulatory requirements and internal standards.
  • Technology Innovation: Continuously research and adopt new technologies and best practices to improve the efficiency, scalability, and resilience of our data platform.
  • Cloud Deployment & Monitoring: Own the deployment and monitoring of data pipelines and related infrastructure on cloud platforms such as OpenShift, ECS, or Kubernetes, ensuring optimal performance and reliability.
  • Operational Excellence: Occasionally work a non-standard shift, including nights and/or weekends, and/or carry on-call responsibilities to support critical data operations.
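
To give a concrete flavor of the Database Optimization work above, here is a minimal SQLAlchemy sketch. It is illustrative only: the connection URL, the events table, and its columns are hypothetical, and it assumes SQLAlchemy 1.4+ against a PostgreSQL warehouse.

```python
# Minimal sketch: parameterized, streaming query via SQLAlchemy.
# Connection URL, table, and column names are hypothetical.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg2://user:secret@host/warehouse")

def load_recent_events(days=7):
    # Bound parameters keep query plans cacheable and prevent SQL
    # injection; stream_results avoids pulling all rows into memory.
    query = text("""
        SELECT event_id, event_ts, payload
        FROM events
        WHERE event_ts >= now() - (:days * interval '1 day')
    """)
    with engine.connect() as conn:
        result = conn.execution_options(stream_results=True).execute(
            query, {"days": days}
        )
        for row in result:
            yield row
```

Streaming the result keeps memory flat on large extracts, which is exactly the kind of performance concern this role calls out.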

Requirements

  • 6+ years of hands-on experience in a Data Engineering role.
  • Strong proficiency in Python (version 3.6+), with experience in Python packaging and widely used data libraries such as Pandas and NumPy.
  • Experience implementing REST APIs in Python using microframeworks like Flask (see the sketch after this list).
  • Extensive experience working with both relational and NoSQL databases.
  • Hands-on skills in writing complex SQL and optimizing queries for performance.
  • Solid understanding of data warehousing concepts and experience working with large datasets, including data modeling and ETL processes.
  • Experience working in a Continuous Integration and Continuous Delivery environment and familiarity with tools like Jenkins, TeamCity, SonarQube, OpenShift, ECS, or Kubernetes.
  • Proficiency in industry-standard best practices such as design patterns, coding standards, code modularity, and prototyping.
  • Strong communication skills, both written and verbal, with the ability to explain complex technical concepts to both technical and non-technical audiences.
  • Bachelor's degree in Computer Science, Software Engineering, or a related field.
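
For the Flask requirement above, the sketch below shows a bare-bones REST endpoint; the /pipelines resource and its in-memory store are hypothetical stand-ins for a real metadata service.

```python
# Minimal sketch of a Flask REST microservice.
# The resource name and in-memory store are hypothetical.
from flask import Flask, jsonify, request

app = Flask(__name__)
_pipelines = {}  # stand-in for a real metadata store

@app.route("/pipelines/<name>", methods=["GET"])
def get_pipeline(name):
    pipeline = _pipelines.get(name)
    if pipeline is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(pipeline)

@app.route("/pipelines/<name>", methods=["PUT"])
def upsert_pipeline(name):
    # Store the JSON body as the pipeline definition (upsert semantics).
    _pipelines[name] = request.get_json(force=True)
    return jsonify(_pipelines[name]), 200

if __name__ == "__main__":
    app.run(port=8080)
```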

Nice To Haves

  • Experience with data visualization tools and techniques for presenting data insights effectively.
  • Familiarity with agile development methodologies and experience working in agile teams.
  • Experience with workflow management tools like Airflow; experience with PySpark or PyFlink is a major plus (a minimal Airflow sketch follows this list).
  • Leadership & Mentorship: Ability to guide and mentor junior developers, fostering a collaborative team environment and promoting professional growth.
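
For the Airflow item above, here is a minimal DAG sketch wiring extract, transform, and load steps; the DAG id, schedule, and task bodies are hypothetical placeholders, and the schedule argument assumes Airflow 2.4+.

```python
# Minimal sketch of an Airflow DAG: extract -> transform -> load.
# DAG id, schedule, and task bodies are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**_):
    print("pull rows from the source system")

def transform(**_):
    print("clean and reshape the extracted rows")

def load(**_):
    print("write the transformed rows to the warehouse")

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> transform_task >> load_task
```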

Responsibilities

  • Design, develop, and maintain database schemas and models.
  • Write and optimize SQL queries for data retrieval, manipulation, and reporting.
  • Communicate technical concepts and solutions effectively to both technical and non-technical audiences.
  • Provide technical support and troubleshooting for production systems.
  • Stay up-to-date with the latest trends and technologies in Python development, database systems, and data engineering.
  • Evaluate and recommend new tools and technologies to improve development efficiency and product quality.
  • Contribute to the continuous improvement of development processes and practices.
  • Guide and mentor junior developers.
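
Finally, to illustrate the data quality checks called out in the Data Quality Assurance duty above, here is a minimal pandas sketch; the column names and the 1% null-rate threshold are hypothetical.

```python
# Minimal sketch of row-level data quality checks with pandas.
# Column names and thresholds are hypothetical examples.
from typing import List

import pandas as pd

def check_quality(df: pd.DataFrame) -> List[str]:
    """Return a list of human-readable data quality issues."""
    issues = []
    if df["event_id"].duplicated().any():
        issues.append("duplicate event_id values")
    null_rate = df["event_ts"].isna().mean()
    if null_rate > 0.01:  # tolerate at most 1% missing timestamps
        issues.append("event_ts null rate too high: {:.2%}".format(null_rate))
    if (df["amount"] < 0).any():
        issues.append("negative amounts found")
    return issues

if __name__ == "__main__":
    frame = pd.DataFrame({
        "event_id": [1, 2, 2],
        "event_ts": ["2024-01-01", None, "2024-01-02"],
        "amount": [10.0, -5.0, 3.0],
    })
    for issue in check_quality(frame):
        print("quality issue:", issue)
```

Checks like these would typically run as a gate inside the pipeline, failing the run (or alerting) before bad rows reach the warehouse.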