Welcome to Vivek Kale's Homepage.
I am a Principal Member of Technical Staff at Sandia National Laboratories - California. I primarily work on research and development for HPC Software Technology. This technology enables Scientific Software relevant to the U.S. Department of Energy to run efficiently on supercomputing platforms whose nodes have heterogeneous devices, including GPUs. I am also interested in AI/ML for automated performance tuning and automated testing in HPC (including modern generative AI techniques), AI/ML-guided Scientific Software, and hybrid supercomputer-cloud platforms.
Additionally, my work can be used to optimize AI/ML workloads on modern data center platforms having GPUs. Key topics in my work are Kokkos, C++ Parallel STL, C++ Executors, OpenMP, MPI, loop scheduling and loop transformations, LLVM, adaptive runtime systems, tools for profiling and debugging, and performance modeling.
I completed my PhD in Computer Science in May 2015 at the University of Illinois at Urbana-Champaign. During this time, I also worked with and held positions at U.S. Department of Energy laboratories, particularly Lawrence Livermore National Laboratory and Argonne National Laboratory.
Since earning my doctorate, I have continued research on combining loop scheduling with inter-node load balancing, so that the two work synergistically to improve the performance of scientific and engineering simulations run on supercomputers.
A highlight of my work is the development of multi-GPU programming capabilities in LLVM's OpenMP, including mechanisms for maintaining data locality when scheduling computation and data across multiple GPUs. This is increasingly important for scaling HPC applications involving AI/ML, bioinformatics, and drug discovery on multi-GPU platforms such as NVIDIA's DGX.
Representative Work
Vivek Kale, Hanru Yan, Shyamali Mukherjee, Jackson Mayo, Keita Teranishi, Richard Rutledge and Alessandro Orso.
Toward Automated Detection of Portability Bugs in Kokkos Parallel Programs. 8th International Workshop on Software Correctness for HPC Applications, SC24. November 18, 2024. [paper] [code]
Mathialakan Thavappiragasam and Vivek Kale.
CPU-GPU Performance Tuning for Improving Performance of Modern Scientific Applications on Exascale Supercomputers. IEEE International Conference on High-Performance Computing (HiPC) 2023. Goa, India. December 18-21, 2023. [paper] [code]
Shravan Kale, Kevin Huck, David Boehme, Vanessa Surjadidjaja, and Vivek Kale.
Performance Analysis and Auto-tuning Tools for Performance Portable Parallel Programs. 2023 ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis. Denver, CO, USA. November 12-17, 2023. [paper] [code]
Vivek Kale, Vanessa Surjadidjaja, Christian Trott, and James Brandt.
Data Order Reduction for Performance Monitoring of Supercomputers via the Kokkos Tools Sampler Utility. LDMSCon 2023. Boston, MA, USA. June 13-15, 2023. [paper] [code]
Vivek Kale and Shyamali Mukherjee.
Tools to Rapidly Develop Sophisticated HPC Software Libraries. SIAM Computational Science and Engineering Conference 2023. Amsterdam, Netherlands. March 2, 2023. [paper] [code]
Mathialakan Thavappiragasam and Vivek Kale.
OpenMP’s Asynchronous Offloading for All-pairs Shortest Path Graph Algorithms on GPUs. HiPar 2022 Workshop at The 2022 International Conference for High Performance Computing, Networking, Storage, and Analysis. November 16, 2022. Dallas, Texas, USA. [paper] [code]
Mathialakan Thavappiragasam, Vivek Kale, Oscar Hernandez and Ada Sedova.
Addressing Load Imbalance in Bioinformatics and Biomedical Applications: Efficient Scheduling across Multiple GPUs. In Proceedings of the 12th International Workshop on High Performance Bioinformatics and Biomedicine. December 9, 2021. Houston, Texas (virtual). [paper] [code]
Raul Torres, Vivek Kale, Abid Malik, Tom Scogland, Roger Ferrer and Barbara M. Chapman.
Support in OpenMP for Multi-GPU Parallelism. The International Conference for High Performance Computing, Networking, Storage, and Analysis. Extended Abstract and Poster. November 19, 2021. St. Louis, Missouri, USA. [paper] [code]
Seonmyeong Bak, Colleen Bertoni, Swen Boehm, Reuben Budiardja, Barbara M. Chapman, Johannes Doerfert, Markus Eisenbach, Hal Finkel, Oscar Hernandez, Joseph Huber, Shintaro Iwasaki, Vivek Kale, Paul R.C. Kent, JaeHyuk Kwack, Meifeng Lin, Piotr Luszczek, Ye Luo, Buu Pham and P.K. Yeung.
OpenMP Application Experiences: Porting to Accelerated Nodes. In Journal of Parallel Computing. October 23rd, 2021. [paper]
Vivek Kale, Wenbin Lu, Anthony Curtis, Abid Malik, Barbara Chapman and Oscar Hernandez.
Toward Supporting MultiGPU targets via taskloop and User-defined Schedules. Proceedings of the 2020 International Workshop on OpenMP. September 23-25, 2020. Austin, USA (virtual). [paper] [code]
Jonas H. Muller Korndorfer, Florina Ciorba, Christian Iwainsky, Johannes Doerfert, Hal Finkel, Vivek Kale and Michael Klemm.
A Runtime Approach for Dynamic Load Balancing of OpenMP Parallel Loops in LLVM. Extended Abstract (Poster). Proceedings of the 2019 ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis. [paper]
Vivek Kale, Christian Iwainsky, Michael Klemm, Jonas H. Muller Korndorfer and Florina Ciorba. Toward a Standard Interface for User-defined Scheduling in OpenMP. Fifteenth International Workshop on OpenMP. September 2019. Auckland, New Zealand. [paper]
Vivek Kale and Oscar Hernandez. User-defined Schedules in OpenMP for Improved Performance Portability. Department of Energy Performance, Portability, and Productivity Workshop. Poster. April 2019. Denver, USA. [paper] [code]
Vivek Kale and Martin Kong. Locality-aware Loop Scheduling Strategies in OpenMP. Extended Abstract. OpenMPCon 2018. September 2018. Barcelona, Spain. [paper]
Vivek Kale, Harshitha Menon and Karthik Senthil. Adaptive Loop Scheduling with Charm++ to Improve Performance of Scientific Applications. SC 2017 Poster. Denver, USA. November 2017. (Selected as a Candidate for the Best Poster) [pdf]
Vivek Kale and William D. Gropp. A User-defined Schedule for OpenMP. OpenMPCon 2017. September 2017. New York, USA. [paper]
Vivek Kale and William D. Gropp. Composing Low-overhead Scheduling Strategies for Improving Performance of Scientific Applications. IWOMP 2015. October 2015. Aachen, Germany. [paper]
Simplice Donfack, Laura Grigori, William D. Gropp and Vivek Kale. Hybrid Static/Dynamic Scheduling for Already Optimized Dense Matrix Factorization. IPDPS 2012. May 2012. Shanghai, China. [paper]
Vivek Kale, Todd Gamblin, Torsten Hoefler, Bronis R. de Supinski and William D. Gropp. Slack-conscious Lightweight Loop Scheduling for Scaling Past the Noise Amplification Problem. SC 2012 Poster. November 2012. Salt Lake City, USA. [pdf]
Torsten Hoefler, James Dinan, Darius Buntinas, Pavan Balaji, Brian Barrett, Ron Brightwell, William Gropp, Vivek Kale and Rajeev Thakur. MPI+MPI: A New Hybrid Approach to Parallel Programming with MPI Plus Shared Memory. EuroMPI 2012. September 2012. Madrid, Spain. [paper]
Vivek Kale and William Gropp. Load Balancing Regular Meshes on SMPs with MPI. EuroMPI 2010. September 2010. Stuttgart, Germany. (Selected as a Best Paper) [pdf]
Vivek Kale and Edgar Solomonik. Parallel Sorting Pattern. ParaPLoP 2010. March 2010. Carefree, USA. [pdf]
This page was last updated on October 17, 2024.