Denny's laboratory seeks to discover gene-disease relationships by gathering, assessing, and analyzing the human phenome across genomics and environmental exposures. With this mission in mind, members of the Precision Health Informatics Section primarily repurpose health records as a source of longitudinal phenotype information and link them to genomic and other data. We partner with researchers to create trans-initiative resources cataloguing genetic associations across phenotypes.
The Precision Health Informatics Section relies on large-scale data and bioinformatics approaches to more effectively characterize and identify genetic diseases. Electronic Health Records (EHRs), which typically include hospital billing codes, laboratory results and vital signs, provider documentation, reports and tests, and medication records, are the major source of data for the laboratory. Over the years, studies have demonstrated that genetic analyses based on EHRs typically have larger sample sizes, are more cost-effective, and provide more opportunities for broad-ranging longitudinal investigations. In light of these facts, EHRs have become a powerful resource for illuminating shared genetic architecture across diseases through genome-wide association studies (GWAS), phenome-wide association studies (PheWAS), transcriptome-wide association studies (TWAS) and colocalization analyses, pharmacogenomic investigations, polygenic risk scores (PRS), phenotype risk scores (PheRS), Mendelian randomization (MR), biogeographic ancestry modeling, and exploration of the impact of rare disease variants. GWAS evaluates the association of millions of genetic variants with a particular disease, while PheWAS examines the range of diseases associated with a particular genetic variant (or other analyte) to identify potentially pleiotropic relationships.
Currently, the laboratory harnesses data from large-scale biorepositories such as the eMERGE (Electronic Medical Records and Genetics) Network, BioVU, UK Biobank, Million Veteran Program (MVP), and All of Us. These types of approaches, combined with biomedical and functional genomic informatics resources as well as innovative statistical modeling techniques, can elucidate the genomic architecture of disease, common biological mechanisms underlying disease development and progression, and clinically relevant therapeutic targets. Denny's group aims to analyze complex datasets incorporating hundreds of thousands of predictors and up to millions of subjects.
Additionally, members of the laboratory have incorporated dense, temporal data from intensive care environments and sparse, sporadically collected outpatient data on diverse and heterogeneous study populations. Their work accounts for complex interactions within a high-dimensional feature space through statistical and artificial intelligence (AI)/machine learning algorithms designed to handle this complexity. This research relies on highly imbalanced data with rare outcomes and events. In conjunction with these data sources and growing research partnerships, the laboratory aims to identify features that track with behavioral health traits (e.g., activity, sleep, imaging) and build novel phenotypes (e.g., predicted suicide risk, predicted carrier status for a genetic variant).
EHRs offer a unique chance to evaluate a multitude of health outcomes, including complex human disease, response to medication, clinical characteristics, and environmental influences impacting patient health, for association with genetic factors. While an overwhelming amount of information is available in EHRs, these data are typically unstructured, and common difficulties arise from data availability, missingness, and inconsistency. Therefore, another meaningful component of the laboratory's mission is to extract practical information from EHRs in a systematic and unbiased fashion. Denny and others have developed tools that facilitate data extraction, such as KnowledgeMap for natural language processing of clinical text and “phecodes” for phenotypic restructuring and harmonization of EHR billing codes.
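The contrast between GWAS (one disease, many variants) and PheWAS (one variant, many diseases) described in this section can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the variant and phenotype names are hypothetical, the eight-subject data are synthetic, and the unadjusted odds ratio stands in for the covariate-adjusted regression models that real analyses generally use. It is not the laboratory's pipeline.

```python
from typing import Dict, List


def odds_ratio(exposure: List[int], outcome: List[int]) -> float:
    """Unadjusted odds ratio from a 2x2 table.

    Adds 0.5 to every cell (Haldane-Anscombe correction) so that
    empty cells do not cause a division by zero.
    """
    a = b = c = d = 0
    for e, o in zip(exposure, outcome):
        if e and o:
            a += 1  # carrier, case
        elif e:
            b += 1  # carrier, control
        elif o:
            c += 1  # non-carrier, case
        else:
            d += 1  # non-carrier, control
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


def gwas_scan(variants: Dict[str, List[int]], phenotype: List[int]) -> Dict[str, float]:
    """GWAS-style scan: fix one phenotype, test every variant against it."""
    return {name: odds_ratio(carriers, phenotype) for name, carriers in variants.items()}


def phewas_scan(variant: List[int], phenotypes: Dict[str, List[int]]) -> Dict[str, float]:
    """PheWAS-style scan: fix one variant, test it against every phenotype."""
    return {name: odds_ratio(variant, status) for name, status in phenotypes.items()}


# Synthetic toy data for 8 subjects (1 = carrier / case, 0 = non-carrier / control).
variants = {
    "var_A": [1, 1, 1, 1, 0, 0, 0, 0],  # hypothetical variant
    "var_B": [1, 0, 1, 0, 1, 0, 1, 0],  # hypothetical variant
}
phenotypes = {
    "pheno_X": [1, 1, 1, 0, 0, 0, 0, 1],  # hypothetical phecode-like phenotype
    "pheno_Y": [0, 1, 0, 1, 0, 1, 0, 1],  # hypothetical phecode-like phenotype
}

gwas = gwas_scan(variants, phenotypes["pheno_X"])
phewas = phewas_scan(variants["var_A"], phenotypes)

print("GWAS for pheno_X:", gwas)
print("PheWAS for var_A:", phewas)
```

The two scans share the same association test and differ only in which axis is held fixed. In practice each test would also need multiple-testing correction across the variants or phenotypes scanned.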