MVP data available for research
Through centralized data collection, cleaning, and curation, MVP has a wealth of health records, self-reported surveys, and genetic data available for research, with the generation of other omics data underway. MVP researchers contribute to the curation of phenotypes, and all MVP phenotype definitions are stored in the Centralized Interactive Phenomics Resource (CIPHER), a publicly accessible phenotype knowledgebase. Note: CIPHER does not contain patient-level data. CIPHER stores algorithms (which are instructions or “recipes”) for using MVP data to define health conditions.
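To make the idea of a phenotype "recipe" concrete, the following is a minimal sketch of what a rule-based definition can look like in code. It is not a CIPHER definition and does not use MVP data: the rule (at least two qualifying diagnosis codes on distinct dates), the code list, the record layout, and the function name are all assumptions chosen for illustration, and the records are synthetic.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rule parameters, chosen only for illustration.
QUALIFYING_CODES = {"E11.9", "E11.65"}  # example ICD-10-CM type 2 diabetes codes
MIN_DISTINCT_DATES = 2                  # assumed threshold, not an MVP rule


def patients_meeting_definition(records):
    """Return IDs of patients with qualifying codes on enough distinct dates.

    `records` is an iterable of (patient_id, diagnosis_code, visit_date)
    tuples. This layout is an assumption, not an MVP or CIPHER schema.
    """
    dates_by_patient = defaultdict(set)
    for patient_id, code, visit_date in records:
        if code in QUALIFYING_CODES:
            dates_by_patient[patient_id].add(visit_date)
    return {
        patient_id
        for patient_id, dates in dates_by_patient.items()
        if len(dates) >= MIN_DISTINCT_DATES
    }


# Synthetic records: no real patient data.
synthetic_records = [
    ("A", "E11.9", date(2020, 1, 5)),
    ("A", "E11.65", date(2020, 6, 1)),
    ("B", "E11.9", date(2021, 3, 2)),
    ("B", "E11.9", date(2021, 3, 2)),   # same date, counted once
    ("C", "I10", date(2019, 9, 9)),     # non-qualifying code
]

cases = patients_meeting_definition(synthetic_records)
print(sorted(cases))
```

Requiring codes on distinct dates is one common way to reduce false positives from a single rule-out visit; actual definitions in the knowledgebase may combine other data types and criteria.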
Applying for access to MVP data
- MVP genomic and phenotypic data is available to VA researchers through VA-funded research projects and select non-VA federal funding.
- While opportunities for accessing MVP data are evolving, access is currently limited to VA-affiliated researchers.
- VA-affiliated researchers can submit proposals in response to requests for applications (RFAs) from our ORD services, listed at RFAs and Program Announcements (va.gov). VA researchers can also apply for select types of non-VA federal funding.
Exploring MVP data
The following resources allow researchers to explore MVP data while planning and developing research proposals. Some resources may only be accessible to VA users.
- The VA MVP Data Explorer tool enables researchers to query data based on clinical and other data characteristics to build rough cohorts, estimate sample sizes, and perform power analyses.
- This tool is accessible to VA users with an NT account to help explore MVP data while planning and developing research proposals.
- The Centralized Interactive Phenomics Resource (CIPHER) is an online knowledge sharing platform that aims to optimize electronic health record (EHR) data for use in research and clinical operations.
- CIPHER’s standardized, searchable platform is a public resource that optimizes phenotype reproducibility, consistency, and scalability across health systems. Phenotype definitions contributed by MVP investigators can be found by browsing the CIPHER knowledgebase.
- To learn more about CIPHER, please refer to our manuscript describing the platform.
- Summary results for many of MVP’s projects are available via the MVP accession phs001672.v11.p1 in the Database of Genotypes and Phenotypes (dbGaP). Access individual SNP lookups or fill out a short application to dbGaP to download full summary statistics here.
- View publications from MVP research studies in PubMed.
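As an illustration of the kind of sample-size estimate mentioned above for proposal planning, the sketch below computes the number of participants needed per group to detect a difference between two proportions. It uses the standard normal-approximation formula and only the Python standard library. It is not the method used by the VA MVP Data Explorer, and the function name and the example prevalences are assumptions.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group sample size for comparing two proportions.

    Normal approximation for a two-sided test with equal group sizes:
        n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p1-p2)^2
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ to define an effect size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical planning scenario: outcome prevalence of 10% vs. 15%.
needed = n_per_group(0.10, 0.15)
print(f"Approximate participants needed per group: {needed}")
```

Comparing an estimate like this with the size of a rough cohort gives an early indication of whether a proposed analysis is feasible; a full power analysis would also account for study design, covariates, and multiple testing.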
MVP analytics environments and tools
MVP provides the following centralized analytics environments and tools to support researchers in their studies. Researchers can also bring approved tools into the MVP analytics environment. The most commonly used analysis software and programming languages are available in the computational environments and updated regularly. New tools and software can be added upon request and approval.
Analytical environments
Genomic Information System for Integrative Science (GenISIS)
- The Genomic Information System for Integrative Science (GenISIS) is a high-performance computing (HPC) cluster that approved MVP researchers use to analyze MVP genetic data.
- GenISIS provides 2,354 cores for analysis and more than 6.3 PB of storage. Access from GenISIS to the VA enterprise cloud (VAEC) is currently being tested and will be available for MVP research in the future.
- Access to this analytical environment is only available to VA system users.
VA Informatics and Computing Infrastructure (VINCI)
- The VA Informatics and Computing Infrastructure (VINCI) is a Health Services Research & Development Resource Center that provides researchers a nationwide view of high-value VA patient data.
- VINCI is a research and development partnership and operational platform for health services research, epidemiology, decision support, and business intelligence. All MVP awards use this system for controlled access and analysis of MVP clinical and survey data.
- Access to this analytical environment is only available to VA system users.
MVP-CHAMPION (Computational Health Analytics for Medical Precision to Improve Outcomes Now)
- MVP-CHAMPION is a partnership between the VA and Department of Energy (DOE) formed in 2016 to combine VA health data, MVP data, computing resources, and artificial intelligence/machine learning (AI/ML) technologies to drive precision medicine research. This environment is currently only available for Program-directed ORD projects.
- DOE’s Oak Ridge National Laboratory (ORNL) provides a computing environment accessible to both VA and DOE researchers for conducting research and analyzing MVP data.
- ORNL established a secure computing enclave for the VA called the Knowledge and Discovery Infrastructure (KDI), where copies of the VA CDW data and MVP genetic and survey data are stored. MVP and DOE researchers involved in the MVP-CHAMPION projects are using this environment to advance the frontiers of precision medicine. Its high-performance computing is beginning to expand the research methods available, which will help improve researchers' understanding of issues important to Veterans' health.
VA Data Commons (coming soon)
- The VA Data Commons will bring de-identified VA and MVP data together with computational and scientific tools on one common platform, where VA and non-VA researchers can obtain research access based on Veterans’ data-sharing preferences.