Data Engineer Senior

HealthPartners/GHI, Bloomington, MN

About The Position

HealthPartners is hiring a Data Engineer Senior. Our mission is to provide simple and affordable healthcare. HealthPartners teams use data to improve the patient and member experience, improve health, and reduce the per capita cost of health care. HealthPartners data engineers are responsible for building, managing, and optimizing the data pipelines that move data in service of these goals, implementing and testing methods (or building systems) that improve data reliability and quality. Data engineers work in collaborative scrum teams with other developers, analysts, and data scientists, and may share work efforts in order to achieve sprint goals. They champion and embrace leading practices in the field and develop processes to effectively store, manage, and deliver data. As part of their role, data engineers are responsible for reducing manual data work and improving productivity; they employ and test innovative tools, techniques, and architectures to detect patterns and automate common or repetitive data preparation and integration tasks.

Requirements

  • Bachelor’s degree in computer science, data or social science, operations research, statistics, applied mathematics, econometrics, or a related quantitative field AND 4+ years of experience in business analytics, data science, software development, data modeling, and/or data engineering; OR a master’s degree in computer science, math, or software engineering is acceptable
  • 5+ years of programming experience, with command of at least two of the following languages: SQL, Python, Java, R, or Spark
  • Expert proficiency in SQL; experience with Oracle, PostgreSQL, MySQL, or Microsoft SQL Server
  • 4+ years of experience with at least one product in the BI/analytics space, such as Azure Data Factory, Synapse, Data Explorer, Cosmos DB, App Insights, Power BI, or Databricks (or similar)
  • 3+ years of experience in one or more of the following areas: graph databases, big data, or hyperscale products in Azure
  • Experience using Git, or a comparable code repository
  • Must be motivated, self-driven, curious, and creative
  • Must be a skilled communicator and demonstrate an ability to work with end users and business leaders
  • Demonstrate the ability to support and complement the work of a diverse development and/or operations team
  • Experience working in a collaborative, cross-functional team; active participation in sprint reviews
  • Expert knowledge and a minimum of four years’ experience in the following domains: data management, software engineering, and I&O (infrastructure and operations)

Nice To Haves

  • Pharmacy data experience
  • Knowledge of health care operations
  • Exposure to agile/scrum
  • Ability to work in a hybrid cloud environment consisting of on-premises and public cloud infrastructure. An ideal candidate will have experience with one or more of the following skill sets:
  • Expert in relational databases such as Oracle and SQL Server
  • Expert in optimizing and tuning SQL/Oracle queries, stored procedures, and triggers
  • Expert in Python (NumPy, pandas, Matplotlib, etc.) and Jupyter notebooks for exploratory data analysis, machine learning, and process automation
  • Expert in CI/CD, continuous testing, and site reliability engineering
  • Expert in Microsoft Azure applications such as Azure Data Factory, Synapse, Purview, Databricks/Spark, Power BI, and PowerApps
  • Expert in event streaming tools such as NiFi, Kafka, and Flink
  • Expert in data processing tools such as Apache Sqoop, Spark, and Hive
  • Expert in document or NoSQL datastores, particularly MongoDB
  • Expert in Power BI data models using advanced Power Query and DAX
  • Expert in AI/MLOps
  • Interest and desire to contribute to emerging practices around DataOps (CI/CD, IaC, configuration management, etc.)
  • Ability to collaborate effectively with product management, program management, engineers, and stakeholders
  • Excellent analytical and critical thinking skills
  • Ability to influence without authority and thrive in an ambiguous environment.

Responsibilities

  • Work with stakeholders, data scientists and analysts to frame problems, clean and integrate data, and determine the best way to provision that data on demand
  • Collaborate with other developers to design technology solutions that achieve measurable results at scale
  • Collaborate with data scientists to train, develop and operationalize learning models
  • Design, build, and develop scalable, efficient data pipeline processes to handle the data ingestion, cleansing, transformation, integration, and validation required to provide access to prepared data sets for analysts and data scientists
  • Evaluate and recommend technology and frameworks for building cross product data assets to optimize for flexibility, long-term viability, and time to market
  • Guide, mentor, and influence others to adopt a cloud-first modern data architectural direction, and consistently apply the associated standards and best practices
  • Anticipate the need for data governance and work with designated committees to ensure that data modeling and data handling procedures comply with applicable laws and policies across the data pipeline
  • Lead root cause analysis in response to detected problems/anomalies to identify the reason for alerts, and implement advanced solutions that prevent recurring points of failure
  • Apply in-depth knowledge of the business to design a data model appropriate for the project, and translate business requirements into design specification documents that model the flow and storage of data across multiple data pipelines
  • Identify multiple, complex data sources and build advanced code to extract raw data from identified upstream sources using query languages, tools, or machine learning algorithms, while ensuring the accuracy, validity, and reliability of the data across the pipeline
  • Formulate techniques for quality data collection to ensure adequacy, accuracy and legitimacy of data.
  • Perform unit tests and conduct reviews with other team members to make sure your code is rigorously designed, elegantly coded, and effectively tuned for performance
  • Participate in code reviews and champion the adoption of best practices
  • Support the data environment by releasing new features, resolving issues for users, and working with other technology teams
  • Monitor and analyze information and data systems and evaluate their performance to discover ways of enhancing them (such as new technologies and upgrades).
  • Perform other duties as required to meet team sprint goals