Databricks Architect

Inizio Partners Corp · New York, NY
Remote

About The Position

The Databricks Architect will develop and optimize ETL pipelines from various data sources using Databricks on cloud (AWS, Azure, etc.), building standardized pipelines with automated testing, Airflow scheduling, Azure DevOps for CI/CD, Terraform for infrastructure as code, and Splunk for monitoring. The role also covers real-time data processing with Spark Structured Streaming, integration of data outputs with REST APIs, and continuous performance and cost improvements across compute and storage. As a lead, the architect manages data engineering projects that implement data-driven communication systems, coordinates global delivery teams using Scrum and Agile methodologies (running ceremonies, managing backlog items, and handling escalations), and relies on strong verbal and written communication skills in client discussions.
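
By way of illustration, the Airflow-based scheduling described above typically amounts to a DAG that triggers a Databricks job on a cron schedule. The sketch below assumes Airflow 2.x with the Databricks provider installed; the DAG id, schedule, connection id, and job id are hypothetical placeholders, not details from this posting.

    # Minimal sketch: an Airflow DAG that triggers a pre-defined Databricks
    # job nightly. All names and ids below are hypothetical.
    import pendulum
    from airflow import DAG
    from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator

    with DAG(
        dag_id="nightly_databricks_etl",     # hypothetical DAG name
        schedule="0 2 * * *",                # run daily at 02:00 UTC
        start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
        catchup=False,
    ) as dag:
        run_etl = DatabricksRunNowOperator(
            task_id="run_etl_job",
            databricks_conn_id="databricks_default",  # Airflow connection to the workspace
            job_id=12345,                             # hypothetical Databricks job id
        )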

Requirements

  • 8+ years of experience developing and implementing ETL pipelines from various data sources using Databricks on cloud
  • Some experience with insurance-domain data is a must
  • Programming languages: SQL, Python
  • Technologies: IaaS (AWS, Azure, or GCP), the Databricks platform, Delta Lake storage, and Spark (PySpark, Spark SQL); see the sketch after this list
  • Project management using Agile and Scrum
  • B.S. degree in a data-centric field (Mathematics, Economics, Computer Science, Information Systems, Information Processing, Engineering, or another science field)
  • Excellent communication & leadership skills, with the ability to lead and motivate team members
  • Ability to work independently with some level of ambiguity and juggle multiple demands
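
For candidates unfamiliar with the stack, the Spark and Delta Lake combination named in the technologies bullet usually looks like the following on Databricks. This is a minimal sketch: the paths and table names are hypothetical, and the Spark session is created explicitly here only so the snippet is self-contained (Databricks normally provides one).

    # Minimal PySpark / Spark SQL / Delta Lake sketch; paths and table names
    # are hypothetical examples, not taken from this posting.
    from pyspark.sql import SparkSession
    import pyspark.sql.functions as F

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

    # Read raw source data (hypothetical landing path) and clean it up.
    raw = spark.read.json("/mnt/raw/policies")
    clean = (
        raw.dropDuplicates(["policy_id"])    # assumes a policy_id column
           .withColumn("loaded_at", F.current_timestamp())
    )

    # Write the result as a Delta table, then query it with Spark SQL.
    clean.write.format("delta").mode("overwrite").saveAsTable("silver.policies")
    spark.sql("SELECT COUNT(*) AS row_count FROM silver.policies").show()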

Nice To Haves

  • Airflow
  • Splunk
  • Kubernetes
  • Power BI
  • Git
  • Azure DevOps

Responsibilities

  • Develop and optimize ETL pipelines from various data sources using Databricks on cloud (AWS, Azure, etc.)
  • Implement standardized pipelines with automated testing, Airflow scheduling, Azure DevOps for CI/CD, Terraform for infrastructure as code, and Splunk for monitoring
  • Continuously improve systems through performance enhancements and cost reductions in compute and storage
  • Utilize Spark Structured Streaming for real-time data processing and integrate data outputs with REST APIs (see the sketch after this list)
  • Lead data engineering projects to manage and implement data-driven communication systems
  • Coordinate global delivery teams, run scrum ceremonies, manage backlog items, and handle escalations
  • Integrate data across different systems and platforms
  • Manage client discussions
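
The streaming responsibility above commonly pairs a streaming read with a foreachBatch sink that posts results to a downstream system. A minimal sketch, assuming a Delta source table and a hypothetical REST endpoint:

    # Minimal Structured Streaming -> REST sketch; the source path, checkpoint
    # location, and endpoint URL are all hypothetical.
    import requests
    from pyspark.sql import SparkSession
    import pyspark.sql.functions as F

    spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

    # Stream incremental changes from a Delta table (hypothetical path).
    events = spark.readStream.format("delta").load("/mnt/bronze/policy_events")
    enriched = events.withColumn("processed_at", F.current_timestamp())

    def post_batch(batch_df, batch_id):
        """Send each micro-batch to a downstream REST API, row by row."""
        # collect() on the driver is acceptable for small demo batches only.
        for row in batch_df.toJSON().collect():
            requests.post(
                "https://example.internal/api/events",  # hypothetical endpoint
                data=row,
                headers={"Content-Type": "application/json"},
                timeout=10,
            )

    query = (
        enriched.writeStream
        .foreachBatch(post_batch)
        .option("checkpointLocation", "/mnt/checkpoints/policy_events")
        .start()
    )

In production one would batch or pool the HTTP calls; the row-by-row POST here simply keeps the sketch short.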