“DCPHR allows Penn State researchers to be part of large multi-site studies in ways that were not previously possible,” said Avnish Katoch, research informatics project manager with Penn State CTSI.
Now, Penn State researchers can perform multi-site studies in collaboration with OHDSI. Penn State researchers interested in accessing the OHDSI community or developing proposals for multi-site studies should request a free informatics consultation.
In addition, CTSI’s Informatics Core can assist with study design, including use of AI/ML. The Informatics Core empowers researchers in several ways, including the following:
- help with study design and feasibility analysis;
- help with cohort definition and data extraction;
- support for data preparation;
- support for analysis of large data sets (characterization, prediction, effect estimation);
- support for model interpretation, interrogation, deployment and inference; and
- AI/ML support for research proposal development.
As a proof of concept for working with the OHDSI community, Penn State participated in Project HERA (Health Equity Research Assessment), in which investigators sought to characterize health and healthcare disparities across different groups, outcomes and databases/countries. Investigators used HERA to ask: Are there systematic patterns of diagnosis coding prevalence for Black and white patients across a network of observational health datasets and across all diagnoses? A publication based on this study is underway.
“During the past year, the Informatics team completed testing of the Penn State instance of the OMOP-based EHR data repository, set up processes for its periodic refresh, assessed data for quality and completeness, and identified steps to improve data quality. The data repository recently transitioned from the test environment to the production environment, allowing us to open it up for use by the larger biomedical data sciences and clinical research communities at Penn State,” said Honavar. “The next milestone for DCPHR is to support the integration of EHR data with other data sets or individual-level socio-demographic data, deidentification of the integrated data, and provision of AI/ML workflows for analyses of multi-modal health data,” he added.
Other informatics data resources
The Penn State CTSI Informatics Core provides access to electronic health record (EHR) data from TriNetX, which includes more than 80 institutional partners of the TriNetX research network. The TriNetX platform supports basic statistical characterization of EHR data, which makes it better suited for preliminary analyses of large EHR datasets. More extensive analyses, e.g., using machine learning, require retrieving the relevant data and running it through AI/ML pipelines (often with assistance from the CTSI Informatics Core’s data science team).
The CTSI Informatics Core
Artificial intelligence and machine learning are essential tools for researchers who work with large data sets. However, it can be challenging to understand how best to access and interface with these large databases. Many research groups at Penn State are working through the CTSI Informatics Core to leverage data science methods to advance their work.
For more information on how the CTSI Informatics Core works, watch this replay of “Harnessing the Power of EHR Data and AI to Advance Biomedical Research,” which covers how and why current research groups have applied artificial intelligence to their research and offers examples of how the computational consulting team can support Penn State researchers’ data science projects.