CAFCW 105 Acceleration of Hyperparameter Optimization via Task Parallelism for Information Extraction from Cancer Pathology Reports

By John Gounley and Hong-Jun Yoon, Oak Ridge National Laboratory

Abstract

Recent advances in high-performance computing systems for artificial intelligence enable large-scale training of models that extract information from free-form natural language text. The development of these models is essential to cancer surveillance research and automation. In this study, we propose an approach that accelerates the training of machine learning models by introducing task parallelism. For information extraction from cancer pathology reports, we implement task parallelism by splitting the task of identifying multiple cancer topography and morphology into several sub-problems, which allows the hyperparameters for each sub-problem to be optimized in parallel. Further, we introduce model parameter inheritance to improve the convergence rate of the hyperparameter optimization runs themselves. We evaluate the feasibility of the proposed method on the Summit supercomputer and demonstrate that it improves time-to-solution by a factor of 10 compared with the traditional model-based optimization algorithm, while maintaining the same level of clinical task performance scores.
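
The two ideas in the abstract, per-sub-problem hyperparameter searches run in parallel and parameter inheritance between trials, can be illustrated with a toy sketch. This is not the authors' implementation: the sub-problem names, the one-weight "model", the quadratic loss, and the use of plain random search (rather than model-based optimization) are all assumptions made for illustration, and threads stand in for the separate compute nodes a real run on Summit would use.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sub-problems: each is a toy regression target, not real data.
SUBTASK_TARGETS = {"topography": 3.0, "morphology": -2.0}


def train(target, lr, w_init, steps=5):
    """Run a few gradient-descent steps on the toy loss (w - target)**2."""
    w = w_init
    for _ in range(steps):
        w -= lr * 2.0 * (w - target)
    return w, (w - target) ** 2


def optimize_subtask(name, target, n_trials=20, seed=0):
    """Random search over the learning rate for a single sub-problem."""
    rng = random.Random(f"{name}-{seed}")  # per-task seed for reproducibility
    best = {"task": name, "lr": None, "loss": float("inf"), "w": 0.0}
    for _ in range(n_trials):
        lr = rng.uniform(0.01, 0.9)
        # Toy parameter inheritance: each trial starts from the weights of
        # the best trial so far instead of from scratch.
        w, loss = train(target, lr, w_init=best["w"])
        if loss < best["loss"]:
            best = {"task": name, "lr": lr, "loss": loss, "w": w}
    return best


def optimize_all(targets):
    """Task parallelism: one independent search per sub-problem."""
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = {
            name: pool.submit(optimize_subtask, name, target)
            for name, target in targets.items()
        }
        return {name: future.result() for name, future in futures.items()}


results = optimize_all(SUBTASK_TARGETS)
for name, best in results.items():
    print(f"{name}: lr={best['lr']:.3f} loss={best['loss']:.3e}")
```

Because the searches share no state, each one can be handed to its own worker. For CPU- or GPU-bound training, a process pool or a cluster scheduler would replace the thread pool used here.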

Cite this work

Researchers should cite this work as follows:

  • John Gounley; Hong-Jun Yoon (2019), "CAFCW 105 Acceleration of Hyperparameter Optimization via Task Parallelism for Information Extraction from Cancer Pathology Reports," https://ncihub.cancer.gov/resources/2302.