DARTNet is a federated network of electronic health record data and other clinical information from multiple organizations across the United States. A federated network is a collection of databases that reside in multiple member practices and that are linked through a secure Web-based system so they can be searched and queried as one large database while maintaining privacy and confidentiality of patient data.
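The federated idea can be illustrated with a minimal sketch. It is not DARTNet's actual implementation: the in-memory SQLite databases, the `condition` table, the practice names, and the codes below are all hypothetical stand-ins. The point is only that the same query runs at each site and that aggregate counts, not patient rows, are combined.

```python
import sqlite3


def make_site_db(rows):
    """Create an in-memory database standing in for one practice's local store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE condition (person_id INTEGER, condition_code TEXT)")
    conn.executemany("INSERT INTO condition VALUES (?, ?)", rows)
    return conn


def federated_count(sites, condition_code):
    """Run the same aggregate query at every site and combine only the resulting counts."""
    per_site = {}
    for name, conn in sites.items():
        (count,) = conn.execute(
            "SELECT COUNT(DISTINCT person_id) FROM condition WHERE condition_code = ?",
            (condition_code,),
        ).fetchone()
        per_site[name] = count
    return per_site, sum(per_site.values())


# Two hypothetical member practices; patient rows never leave their own database.
sites = {
    "practice_a": make_site_db([(1, "E11"), (2, "I10"), (3, "E11")]),
    "practice_b": make_site_db([(1, "E11"), (1, "E11"), (2, "J45")]),
}
per_site, total = federated_count(sites, "E11")
print(per_site, total)
```

Because each site returns only a count, the caller sees the network as one large database without ever receiving row-level patient data.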
In the first model, data from each member organization's electronic health record or clinical data warehouse are extracted into an XML file or a set of flat files that conform to a slightly modified version of the Observational Medical Outcomes Partnership (OMOP) V4 Common Data Model (CDM). The XML or flat files are loaded into the ROSITA system. The ROSITA system serves several functions:
Data are queried via a secure web portal for use in research studies and quality improvement activities. Each practice must grant permission every time its data are made available to DARTNet.
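As a rough illustration of the extraction step in the first model, the sketch below serializes a few records to XML with Python's standard library. The element names (`extract`, `person`) and the fields are hypothetical; the actual schema expected by ROSITA is not described here and should be taken from its own documentation.

```python
import xml.etree.ElementTree as ET


def build_extract(site_id, persons):
    """Serialize person records to an XML string (hypothetical element names, not the ROSITA schema)."""
    root = ET.Element("extract", {"site": site_id})
    for person in persons:
        node = ET.SubElement(root, "person")
        for key, value in person.items():
            # One child element per field, with the value stored as text.
            ET.SubElement(node, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = build_extract("practice_a", [{"person_id": 101, "year_of_birth": 1970}])

# Parse the string back to confirm it is well-formed XML.
parsed = ET.fromstring(xml_text)
print(xml_text)
```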
In the second model, clinics use third-party clinical decision support (CDS) vendors to handle data extraction and standardization. In this model, data from each member practice's EHR are captured, de-identified, coded, standardized, and stored in a database that resides at each practice. These vendors also include other important data sources, such as billing, lab, hospital, and prescription databases, in their secondary databases. The member clinics authorize the third-party CDS vendors to generate the XML or flat files on their behalf for loading into ROSITA, or authorize them to transfer limited data sets for specific studies directly to the DI research team.
In the third model of data extraction, programmers at the clinic site collaborate directly with DI staff to pull a limited data set from the clinic's systems for specific studies. The extracted data can be in nearly any format as long as DI analysts can transform them into the OMOP V4 CDM. In this model, DI analysts programmatically aggregate and standardize limited data sets from each clinic, essentially performing the functions of ROSITA and the web portal. This model provides a means of participation in DI studies for clinic sites that cannot extract data in the exact XML format, or that do not wish to host a grid node.
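The transformation step in the third model might look roughly like the sketch below, which maps one clinic's column names onto a common layout and writes a flat file. Everything here is an assumption for illustration: the three target columns are only loosely modeled on an OMOP-style person table, the clinic field names (`PatID`, `Sex`, `DOB`) are invented, and the gender concept IDs (8507, 8532) should be verified against the vocabulary version actually in use.

```python
import csv
import io

# Hypothetical target layout, loosely modeled on an OMOP-style person table.
TARGET_COLUMNS = ["person_id", "gender_concept_id", "year_of_birth"]

# Gender concept IDs as commonly used in OMOP vocabularies (assumption: verify for your version).
GENDER_CONCEPTS = {"m": 8507, "male": 8507, "f": 8532, "female": 8532}


def to_common_row(site_row, field_map):
    """Rename site-specific fields and normalize their values into the target layout."""
    gender = str(site_row[field_map["gender"]]).strip().lower()
    birth_date = str(site_row[field_map["birth_date"]])  # assumed ISO format, YYYY-MM-DD
    return {
        "person_id": int(site_row[field_map["person_id"]]),
        "gender_concept_id": GENDER_CONCEPTS.get(gender, 0),  # 0 when no mapping is found
        "year_of_birth": int(birth_date[:4]),
    }


def to_flat_file(rows):
    """Write standardized rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=TARGET_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# One hypothetical clinic export and the mapping from its column names to the common ones.
clinic_rows = [
    {"PatID": "101", "Sex": "F", "DOB": "1970-05-14"},
    {"PatID": "102", "Sex": "Male", "DOB": "1985-11-02"},
]
clinic_map = {"person_id": "PatID", "gender": "Sex", "birth_date": "DOB"}

common = [to_common_row(row, clinic_map) for row in clinic_rows]
flat = to_flat_file(common)
print(flat)
```

Keeping the per-clinic differences in a small mapping such as `clinic_map` means each new site needs only a new mapping, not new transformation code.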