A Deep Learning Approach for Overall Survival Prediction in Lung Cancer with Missing Values
One of the most challenging fields where Artificial Intelligence (AI) can be applied is lung cancer research, specifically non-small cell lung cancer (NSCLC). In particular, overall survival (OS), the time between diagnosis and death, is a vital indicator of patient status, enabling tailored treatment and improved OS rates. In this analysis, there are two challenges to take into account. First, few studies effectively exploit the information available from each patient, leveraging both uncensored (i.e., dead) and censored (i.e., survivors) patients, considering also the events' time. Second, the handling of incomplete data is a common issue in the medical field. This problem is typically tackled through the use of imputation methods. Our objective is to present an AI model able to overcome these limits, effectively learning from both censored and uncensored patients and their available features, for the prediction of OS for NSCLC patients. We present a novel approach to survival analysis with missing values in the context of NSCLC, which exploits the strengths of the transformer architecture to account only for available features without requiring any imputation strategy. By making use of ad-hoc losses for OS, it is able to account for both censored and uncensored patients, as well as changes in risks over time. We compared our method with state-of-the-art models for survival analysis coupled with different imputation strategies. We evaluated the results obtained over a period of 6 years using different time granularities obtaining a Ct-index, a time-dependent variant of the C-index, of 71.97, 77.58 and 80.72 for time units of 1 month, 1 year and 2 years, respectively, outperforming all state-of-the-art methods regardless of the imputation method used.
READ FULL TEXT