A Multi-Institutional Open-Source Benchmark Dataset for Breast Cancer Clinical Decision Support using Synthetic Correlated Diffusion Imaging Data
Recently, a new form of magnetic resonance imaging (MRI) called synthetic correlated diffusion (CDI^s) imaging was introduced and showed considerable promise for clinical decision support for cancers such as prostate cancer when compared to current gold-standard MRI techniques. However, the efficacy for CDI^s for other forms of cancers such as breast cancer has not been as well-explored nor have CDI^s data been previously made publicly available. Motivated to advance efforts in the development of computer-aided clinical decision support for breast cancer using CDI^s, we introduce Cancer-Net BCa, a multi-institutional open-source benchmark dataset of volumetric CDI^s imaging data of breast cancer patients. Cancer-Net BCa contains CDI^s volumetric images from a pre-treatment cohort of 253 patients across ten institutions, along with detailed annotation metadata (the lesion type, genetic subtype, longest diameter on the MRI (MRLD), the Scarff-Bloom-Richardson (SBR) grade, and the post-treatment breast cancer pathologic complete response (pCR) to neoadjuvant chemotherapy). We further examine the demographic and tumour diversity of the Cancer-Net BCa dataset to gain deeper insights into potential biases. Cancer-Net BCa is publicly available as a part of a global open-source initiative dedicated to accelerating advancement in machine learning to aid clinicians in the fight against cancer.
READ FULL TEXT