Data Engineer

Sutter Health
Walnut Creek, CA
Hybrid

About The Position

Responsible for developing Sutter Health’s research analytic data infrastructure. This includes all aspects of how data is ingested, stored, collected, governed, cleansed, accessed, and used. It also includes making sure that the data used by the organization is of the highest quality and is made available as soon as possible, in a format that allows the business (researchers) to make critical decisions based on the data.
Utilizes tools and infrastructure such as scalable data pipelines to manage high-volume, high-speed data storage and retrieval, as well as automated testing and tools for improving data quality. Works with all types of data, including batch and streaming data; structured, semi-structured, and unstructured data; files; web downloads; and other sources of data. Creates and improves processes required by other data-dependent functions, including analytics, strategic business intelligence, and data science. Uses state-of-the-art methods to capture, route, and store data, combining information from different sources and transforming it to improve the data’s reliability, quality, and usability. Develops and tests new architectures that enable data extraction, automation, and modeling for predictive or prescriptive analytic purposes. Sets the standard for high-value, high-quality datasets that are accurate, timely, secure, and well suited to the strategic analytic purposes of the research organization.
Works on IRB-approved research studies, providing accurate and timely curated data. Works closely with principal investigators and statisticians. Works in accordance with Research Privacy and HIPAA regulations and methods for safeguarding PHI and PII.
This position is part of a new, exciting strategic initiative within Sutter Health Research. While this role is designated as limited-term, there is a strong possibility that qualified incumbents may be considered for extensions based on organizational needs and performance.
This is a hybrid role with both work from home and onsite requirements. The successful candidate will live in the Sutter footprint.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, Information Management, or Healthcare Administration.
  • 8 years of recent, relevant experience.
  • Experience creating data pipelines on big data platforms and data integrations in databases and data lakes, working with various cloud and on-premises technologies.
  • Experience leveraging scalable data platforms to build secure infrastructure; experience building batch or streaming data ingestion pipelines.
  • Ability to assess and profile raw data and reassemble raw data from multiple sources into a single, enterprise model.
  • Hands-on experience with data management tools (Cloudera, Spark, Python, Databricks, etc.); fluency with SQL programming, scripting, and data architecture.
  • Extensive familiarity with relational database concepts / technologies (SQL, Oracle, etc.) including data design, table design, partitioning, as well as determining the technology to use in any given scenario.
  • Experience ensuring data quality and implementing tools and frameworks for automating identification of data quality issues.
  • Strong understanding of data engineering and data traceability best practices and frameworks.
  • Ability to work in a consulting role, building technology and communicating with end-users and customers of varying levels of technical capability.
  • Ability to produce high-quality, professional documentation and communication materials.
  • Strong knowledge of the development of Business Intelligence and Reporting solutions.
  • Ability to translate data into management reports and presentations.
  • Strong problem solving, organization, and prioritization skills.
  • Detail-oriented, with the ability to produce timely results and to work both independently with minimal supervision and as a member of a scrum/product team.
  • Track record of successful project delivery, building collaborative cross-functional relationships, and finding creative ways to solve business problems.
  • Ability to balance the competing needs of multiple priorities and work in a dynamic environment; ability to perform under pressure and in stressful situations.
  • Demonstrable capacity for learning technical concepts and adapting to new technologies quickly; ability to stay current with evolving best practices in data management.
  • Familiar with healthcare provider data structures and sources; experienced with HIPAA regulations and methods for safeguarding PHI and PII through mitigation of data exposure risk.
  • Knowledge of health care operations and structure, and of general requirements in an integrated delivery system.
  • The role requires significant technical skills, including multiple programming languages and extensive knowledge of SQL (T-SQL, ANSI SQL), as well as experience with modern, distributed, scalable data platforms.

Responsibilities

  • Work on IRB-approved research studies, providing accurate and timely curated data.
  • Work closely with principal investigators and statisticians.
  • Work in accordance with Research Privacy and HIPAA regulations and methods for safeguarding PHI and PII.
  • Build tables and analytical datasets for the research team by partnering with the principal investigator and statistician for IRB-approved studies.
  • Propose, develop and implement algorithms for investigating and cleaning data collected from healthcare systems.
  • Work with Research Privacy and Security staff as necessary.
  • Write clear and logical documentation of the sources and methodologies used to extract and transform the information for the researchers, other analysts, and programming staff.