Abstract

The number of childhood cancer survivors has increased dramatically over the past few decades owing to advances in cancer treatment, shifting the priority from clinical treatment alone toward improving the quality of life of long-term survivors. One late effect that greatly affects female survivors is premature ovarian insufficiency (POI). An estimated one in seven female survivors develops POI before age 40. POI dramatically shortens the reproductive window and causes infertility. Fertility preservation procedures are now available to childhood cancer survivors, but without knowing a survivor's risk of future POI, it is difficult to decide whether to pursue them. This study aimed to build a reliable prognostic model that predicts the risk of developing POI by prespecified ages in female cancer survivors, in order to inform decision-making on fertility preservation.

We included 7,891 female survivors participating in the Childhood Cancer Survivor Study. Multiple imputation was used to handle missing data, and an inverse probability of censoring weight was assigned to each individual to account for censoring. Elastic-Net penalized logistic regression, XGBoost, and an “Ensemble” method were used to predict the risk of experiencing POI by prespecified ages. Model performance was evaluated by nested cross-validation.
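To illustrate the censoring adjustment described above, the following is a minimal sketch of how inverse probability of censoring weights could be computed for a binary "POI by age t" outcome. It assumes the censoring distribution is estimated with a Kaplan-Meier estimator, which the abstract does not specify; the function names `censoring_survival` and `ipcw_weights` and the toy data are hypothetical and not taken from the study.

```python
def censoring_survival(times, events):
    """Kaplan-Meier estimate of G(t) = P(censoring time > t).

    Assumption: censoring (event indicator 0) is treated as the 'event'
    of interest. Ties between events and censorings are not given any
    special handling in this sketch.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    steps = []  # (time, value of G just after that time)
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = 0
        n_censored = 0
        while i < len(data) and data[i][0] == t:
            n_at_t += 1
            if data[i][1] == 0:
                n_censored += 1
            i += 1
        if n_censored > 0:
            surv *= 1.0 - n_censored / at_risk
        steps.append((t, surv))
        at_risk -= n_at_t

    def G(t, left_limit=False):
        # Step function lookup; left_limit=True returns G(t-).
        value = 1.0
        for s, g in steps:
            if s < t or (s == t and not left_limit):
                value = g
            else:
                break
        return value

    return G


def ipcw_weights(times, events, horizon):
    """Binary labels and weights for the outcome 'event by age horizon'.

    times:  age at event or at censoring, whichever came first
    events: 1 if the event (e.g. POI) was observed, 0 if censored
    Subjects censored before the horizon have unknown status and get
    weight 0; all others are up-weighted by 1 / G. A real analysis would
    also need to guard against G being zero or very small.
    """
    G = censoring_survival(times, events)
    labels, weights = [], []
    for t, e in zip(times, events):
        if t <= horizon and e == 1:      # event observed before the horizon
            labels.append(1)
            weights.append(1.0 / G(t, left_limit=True))
        elif t > horizon:                # known to be event-free at the horizon
            labels.append(0)
            weights.append(1.0 / G(horizon))
        else:                            # censored before the horizon
            labels.append(None)
            weights.append(0.0)
    return labels, weights


# Toy data (hypothetical): six subjects, prediction horizon of age 32.
ages = [20, 25, 30, 35, 40, 45]
poi_observed = [1, 0, 1, 0, 0, 0]
labels, weights = ipcw_weights(ages, poi_observed, horizon=32)
```

In this toy example the subject censored at age 25 receives weight 0, and the remaining weights sum to the sample size of six, so the weighted sample stands in for the full cohort. Weights of this kind could then be passed as per-sample weights when fitting a penalized logistic regression or a gradient-boosted model.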

The “Ensemble” method performed best, with AUCs (areas under the receiver operating characteristic curve) of around 0.8 and AP (average positive predictive value) ranging from 0.469 to 0.595 across prespecified ages from 21 to 39. The calibration curves indicated good agreement between the estimated risk of developing POI and the observed status for prespecified ages below 28. The “Ensemble” algorithm could be further developed into a user-friendly clinical tool that provides clinicians and patients with quantitative information when discussing fertility preservation.