Large-Scale Biomedical Discovery through HPC
Enabling scientific questions to be addressed at unprecedented scale through high-performance computing and population-level biomedical datasets
My research has advanced the use of high-performance computing (HPC) for large-scale biomedical data analysis. I led and managed multiple national-scale efforts that integrated supercomputing resources with population-level biomedical datasets, including the $27M VA-DOE MVP-CHAMPION collaboration and earlier NIH-funded infrastructure projects.
Key Achievements
Using DOE leadership-class supercomputers and data from the VA Million Veteran Program (MVP), my team enabled some of the largest genome-wide association studies (GWAS) and phenome-wide association studies (PheWAS) conducted to date, including PheWAS that linked genetic variation to thousands of clinical phenotypes.
These efforts demonstrated how tightly coupling HPC, advanced workflows, and secure data environments can transform population genomics and precision medicine research.
Impact
This work established reproducible computational pipelines and scalable analysis strategies that are now broadly applicable to other national biobank efforts and biomedical research programs.
Related Projects
- VA-DOE MVP-CHAMPION ($27M)
- Exascale Genomics Analysis Toolkit (ExaGATK)
- Large-scale genomic imputation workflows