Open Source Clinical Enterprise Data Warehouse (EDW) Data Browser (Leaf)

CD2H Phase 2 Proposal Template

Project Title:  Open Source Clinical Enterprise Data Warehouse (EDW) Data Browser (Leaf)

Point Person: Nic Dobbins, UW, ndobb@uw.edu

 

Elevator pitch:

To date, cohort discovery tools have been costly for institutions to instantiate, requiring multiple FTEs to extract clinical data and force it to conform to particular data models, along with complex database and server deployments that need ongoing support. Leaf is a lightweight web application for cohort discovery that is data model agnostic and aims to be fun and easy to use. At the University of Washington, where Leaf was developed, it has been successfully deployed as a production-supported tool with hundreds of users. In the next phase of CD2H Leaf development, we would like to begin working with other CTSAs to deploy Leaf against their repositories, and to further develop the tool to meet a greater number of research needs. Of note, Leaf can query multiple partnered repositories in parallel and can be deployed in the cloud, and we are exploring future support for queries via FHIR.

 

Project history:

All Leaf Cloud Pilot deliverables (including proposed phase 2) are tracked here: https://app.smartsheet.com/b/home (see the “Cloud Demonstration of Data EDW/EHR Sharing” section).

 

Yes, this project builds off of the cloud data pilot, which accomplished the following outputs:

  • UW relationship built with AWS and synthetic data environment established
  • Adapted Leaf to query OMOP and i2b2
  • Planning for FHIR adapter began
  • Development started in 2016; Leaf is in production at UW, with pilots initiated at Wash U and JHU
  • Compliance, Human Subjects, and business reviews in an academic health system
  • Governance is underway to use Leaf cross-institutionally between UW, Wash U, and UW regional partners in Data QUEST (DUA and IRB currently under review)
  • Developed data sharing and Leaf use case, IRB and data model with Wash U, UW and Data QUEST
  • Presentations:
    • Presented to Harvard, Stanford, Seattle Children’s, JHU, AWS, Rocky (12/7), ACT (12/?), CTSA PI meeting, ???

 

GitHub repo:

 

Project description:

Academic medical centers and health systems are increasingly challenged with supporting appropriate secondary uses of data from a multitude of sources. To that end, the UW Medicine Enterprise Data Warehouse (EDW) has emerged as a central port for all data, which can include clinical, research, administrative, financial and other data types. Although EDWs have been popular and successful in providing a single stop for data, they are often not self-service and require an informatician or clinical informatics expert to access.

 

To address this challenge, we have developed Leaf, an easy-to-use, self-service, web-based tool for querying, browsing and extracting clinical cohorts from the UW Medicine EDW. Leaf enables querying by data dictionaries or ontologies, allows both de-identified and identified access to patient data, and grants access to these datasets in a compliant manner. While Leaf provides basic visualizations, it contains robust tools for exporting directly to REDCap projects. Leaf's users include both quality improvement and research investigators. It has been developed using an Agile development process, with a soft production rollout to identify and address software, support and data quality concerns.

 

Proposed Solution:

We are engineering Leaf to be broadly accessible by CTSAs as a data model-independent software tool that acts as a ‘side car’ to a data warehouse or data mart. Leaf currently supports both the i2b2 and OMOP data models. We propose to make Leaf an open source, fully functional EDW browser solution, including FHIR-ready capabilities for querying multiple repositories securely. Leaf is not a competitor to i2b2: it does not require creating a separate data mart, and it can support i2b2-based data models.
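As a rough illustration of what “data model independent” can mean in practice, the sketch below maps a single browsable concept to a table, patient column and filter for each supported data model, and generates a cohort count query from that mapping. This is a minimal sketch under stated assumptions, not Leaf's actual implementation: the ConceptMapping class, the MAPPINGS dictionary, the cohort_count_sql function and the concept codes are all hypothetical, and only the table and column names are taken from the public OMOP CDM and i2b2 schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptMapping:
    """Hypothetical link between one browsable concept and a table in a data model."""
    table: str           # table that holds the relevant facts
    patient_column: str  # column that identifies the patient
    where_clause: str    # filter that selects the concept


# One concept ("type 2 diabetes") expressed against two data models.
# The codes are illustrative and should be checked against local vocabularies.
MAPPINGS = {
    "omop": ConceptMapping("condition_occurrence", "person_id",
                           "condition_concept_id = 201826"),
    "i2b2": ConceptMapping("observation_fact", "patient_num",
                           "concept_cd = 'ICD10CM:E11'"),
}


def cohort_count_sql(model: str) -> str:
    """Build a patient-count query for the concept in the given data model."""
    m = MAPPINGS[model]
    return (f"SELECT COUNT(DISTINCT {m.patient_column}) "
            f"FROM {m.table} WHERE {m.where_clause}")


if __name__ == "__main__":
    for model in MAPPINGS:
        print(model, "->", cohort_count_sql(model))
```

In this sketch, the same user-facing concept yields different SQL per repository, so the source data never has to be reshaped into a single model. Supporting another data model would amount to adding another mapping entry.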

 

Benefit:

CTSAs have developed substantial infrastructure for data sharing. Leaf enables self-service access to those repositories, extending the reach of EHR data access services that are currently severely bottlenecked by the need for human intermediaries.

 

Expected outputs (6 months):

  • The Leaf software made available with an “open source” license to CTSAs, with supporting documentation for installation/configuration
  • The Leaf software working against live UW, Wash U, and Data QUEST data as demonstrations of cross-institutional data querying capability, enabled in the cloud