Four-Dimensional Machine Learning Radiomics for the Pretreatment Assessment of Breast Cancer Pathologic Complete Response to Neoadjuvant Chemotherapy in Dynamic Contrast-Enhanced MRI

Journal of Magnetic Resonance Imaging

M. Caballo, W. Sanderink, L. Han, Y. Gao, A. Athanasiou and R. Mann


Background:

Breast cancer response to neoadjuvant chemotherapy (NAC) is typically evaluated by assessing tumor size reduction after a few cycles of NAC. When treatment is ineffective, the patient suffers potentially severe side effects without any actual benefit.


Purpose:

To identify patients achieving pathologic complete response (pCR) after NAC by spatio-temporal radiomic analysis of dynamic contrast-enhanced (DCE) MRI images acquired before treatment.

Study type:

Single-center, retrospective.


Population:

A total of 251 pretreatment DCE-MRI images of breast cancer patients.

Field strength/sequence:

1.5 T/3 T, T1-weighted DCE-MRI.


Assessment:

Tumor and peritumoral regions were segmented, and 348 radiomic features quantifying texture temporal variation, enhancement kinetics heterogeneity, and morphology were extracted. Based on subsets of features identified through forward selection, machine learning (ML) logistic regression models were trained separately on all images and on images stratified by cancer molecular subtype, and were validated with leave-one-out cross-validation.
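The validation scheme described above can be illustrated with a minimal sketch, which is not the authors' implementation. It fits a logistic regression by plain gradient descent, holds out one case at a time, and computes the AUC from the held-out predictions. The data are synthetic, the two features and all function names (`fit_logistic`, `loocv_scores`, `auc`) are hypothetical, and the forward feature selection step is omitted for brevity.

```python
import math
import random


def predict_proba(w, b, x):
    """Logistic model output for one case (z is clamped to avoid overflow)."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def loocv_scores(X, y):
    """Leave-one-out: train on all cases but one, score the held-out case."""
    scores = []
    for i in range(len(X)):
        w, b = fit_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        scores.append(predict_proba(w, b, X[i]))
    return scores


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic example (assumption): 15 "pCR" and 25 "non-pCR" cases, 2 features.
random.seed(0)
labels = [1] * 15 + [0] * 25
features = [[random.gauss(1.5 * lab, 1.0), random.gauss(1.5 * lab, 1.0)]
            for lab in labels]
scores = loocv_scores(features, labels)
print(f"LOOCV AUC: {auc(scores, labels):.3f}")
```

Pooling the held-out predictions into a single AUC is one common way to summarize leave-one-out results; the abstract does not state how the reported AUC values were aggregated.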

Statistical tests:

Feature significance was assessed using the Mann-Whitney U-test. Significance of the area under the receiver operating characteristic (ROC) curve (AUC) of the ML models was assessed using the associated 95% confidence interval (CI). The significance threshold was set to 0.05, adjusted with Bonferroni correction.
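As a rough illustration of the feature-level test, the sketch below computes a two-sided Mann-Whitney U-test with the normal approximation (without tie correction of the variance) and a Bonferroni-adjusted threshold. It is not the authors' code. The number of comparisons is an assumption: the abstract does not say over how many tests the correction was applied, so 348 (the number of extracted features) is used purely for illustration, and the feature values are synthetic.

```python
import math
import random


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U-test (normal approximation); returns (U for a, p-value)."""
    combined = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Tied values share the average of the ranks they span.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n_a, n_b = len(a), len(b)
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u_a, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Synthetic values of one hypothetical feature in the two response groups.
random.seed(1)
pcr_values = [random.gauss(1.0, 1.0) for _ in range(59)]
non_pcr_values = [random.gauss(0.0, 1.0) for _ in range(192)]

u_stat, p = mann_whitney_u(pcr_values, non_pcr_values)
threshold = bonferroni_threshold(0.05, 348)  # assumption: 348 comparisons
print(f"U = {u_stat:.1f}, p = {p:.2e}, significant: {p < threshold}")
```

With 348 comparisons, the per-feature threshold would be 0.05 / 348, or roughly 1.4e-4; a different number of comparisons would change this value accordingly.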


Results:

Nine features related to texture temporal variation and enhancement kinetics heterogeneity were significant in discriminating cases achieving pCR from non-pCR cases. The ML models achieved significant AUCs of 0.707 (all cancers, n = 251, 59 pCR), 0.824 (luminal A, n = 107, 14 pCR), 0.823 (luminal B, n = 47, 15 pCR), 0.844 (HER2 enriched, n = 25, 11 pCR), and 0.803 (triple negative, n = 72, 19 pCR).

Data Conclusions:

Differences in imaging phenotypes were found between complete and noncomplete responders. Furthermore, ML models trained per cancer subtype achieved high performance in classifying pCR vs. non-pCR cases. They may, therefore, have potential to help stratify patients according to the level of response predicted before treatment, pending further validation with larger prospective cohorts.

Evidence Level:


Technical Efficacy:

Stage 4
