Imaging Data Engineer III

Children’s Hospital of Philadelphia, Philadelphia, PA

About The Position

The Center for Data-Driven Discovery in Biomedicine (D3b - d3b.center) seeks an experienced Data Engineer to join a multi-disciplinary team of data scientists, radiologists, engineers, and analysts in the Translational Imaging Research Unit (TIRU) committed to accelerating treatments and cures for children with cancer. Under the supervision of a lead data scientist, the hire will support several multi-institutional healthcare, research, and clinical trial initiatives through operational and database management efforts. The hire will be responsible for expanding and optimizing our medical imaging database and data warehouse; building data integrations; developing data best practices and governance; performing clinical and administrative reporting and data visualization; and optimizing data flow, collection, and reporting for cross-functional teams, including external collaborators.
The Data Engineer will leverage existing software and system technologies to facilitate the complete data lifecycle and ensure project delivery, will interact with team members across the imaging, technology engineering, and biospecimen/clinical coordinating units at D3b, and will communicate with various stakeholders. The ideal candidate has experience with all aspects of handling data from multiple complex sources and enjoys optimizing data systems and building them from the ground up. The Data Engineer III will support our developers, database architects, data analysts, and data scientists, ensuring that an optimal data delivery architecture is applied consistently across ongoing projects. They will also support non-technical colleagues in the collection and appropriate use of clinical and non-clinical imaging data. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products.
The right candidate will be excited by the prospect of optimizing or even re-designing our company's data architecture to support our next generation of products and data initiatives. The successful candidate will have experience in code development (Python, RESTful APIs) and code management (GitHub), will communicate effectively, and will enable the research and technical teams to transform data into actionable insights. Deep knowledge of, or direct experience with, medical imaging data (DICOM files), database management, cloud computing services (AWS), and handling of “big data” in biomedical contexts is strongly preferred. Experience with continuous integration and continuous deployment is a bonus.

Requirements

  • Bachelor's Degree - Required
  • At least six (6) years of Data Engineering/Business Intelligence/Data Warehousing experience - Required
  • Strong analytic skills related to working with structured and unstructured datasets.
  • Must possess critical thinking and creative problem-solving skills, along with the ability to communicate well with stakeholders throughout the organization.
  • Strong communication, project management and organizational skills.
  • Highly proficient in SQL
  • Experience with relational SQL and NoSQL databases, including IBM PDA (Netezza), MS SQL Server and HBase.
  • Experience with data integration tools: Informatica, MS Integration Services, Sqoop, etc.
  • Experience with cloud vendors and services: AWS, Google, Microsoft, IBM
  • Experience with stream-processing systems: IBM Streams, Flume, Storm, Spark-Streaming, etc.
  • Experience consuming and building APIs
  • Experience with object-oriented/functional programming languages: Python, Java, C++, Scala, etc.
  • Experience with statistical data analysis tools: R, SAS, SPSS, etc.
  • Experience with visual analytics tools: QlikView, Tableau, Power BI, etc.
  • Experience utilizing Agile methodology for development
  • Familiarity with electronic health record and financial systems, such as Epic Systems, Cerner, Workday, Infor, and Strata.
  • Working knowledge of message queuing, stream processing, and highly scalable data stores.

Nice To Haves

  • Master's Degree in Computer Science, Informatics, Information Systems, or another quantitative field - Preferred
  • At least six (6) years of Data Engineering/Business Intelligence/Data Warehousing experience in a healthcare environment, or at least ten (10) years in a Data Engineer role - Preferred
  • Imaging (DICOM) or other biomedical/life sciences data
  • Programming experience (Python strongly preferred) and use of APIs and RESTful web services
  • Tableau, R-Shiny, or similar data visualization tools
  • Database design and management, relational databases such as Postgres or MySQL preferred
  • Working knowledge of GitHub
  • Cloud platforms (such as Amazon Web Services and Google Cloud Platform)

Responsibilities

  • Assemble and maintain large, complex imaging data sets that meet clinical research requirements. Validate data to ensure quality.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources (including ground, hybrid cloud, and cloud) using SQL and various programming technologies.
  • Data Integration & Modeling – evaluate structured and unstructured data, incorporate new data, determine the most appropriate schema for new fact tables, data marts, etc. while maintaining enterprise best practices and adhering to data governance standards.
  • Reporting – collaborate with colleagues across the enterprise to scope requests. Extract data from various data sources, validate results, create relevant data visualizations, and share with requester. Develop dashboards and automate refreshes as appropriate.
  • Governance / Best Practices – adhere and contribute to enterprise data governance standards. Educate and support colleagues in best practices to ensure that data is used appropriately.
  • Product Ownership – collaborate and act as the voice of the customer, offering concrete feedback and project requests, and serve as an advocate for analytics within the business units themselves.
  • Develop analytics tools that utilize data resources to provide actionable insights, operational efficiency and other key business performance metrics.
  • Work with stakeholders including the Executive, Clinical, and Analyst teams to assist with data-related technical issues and support their data infrastructure needs.
  • Develop optimized tools that help analytics and data science team members build and optimize innovative, industry-leading projects.
  • Integrate predictive and prescriptive models into applications and processes.
  • Build processes supporting data transformation, data structures, metadata, dependency and workload management.
  • Make recommendations about platform adoption, including technology integrations, application servers, libraries, and frameworks.
  • Be a critical part of an Agile Scrum software development team, ensuring the team successfully meets its deliverables each sprint.
  • Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Support and work with cross-functional teams in a dynamic environment.