G Algorithms
EN-ALR:
Elastic Net is a regularization method that combines the LASSO (\(\ell_1\)) and Ridge (\(\ell_2\)) penalties, aiming to avoid overfitting at the cost of increased bias. It enables automatic variable selection (a feature of LASSO) while avoiding the limitations of LASSO regression (Zou and Hastie 2005a).
For logistic regression, the penalized objective minimizes the negative log-likelihood plus the elastic net penalty: \[ \min _{\left(\beta_{0}, \beta\right) \in \mathbb{R}^{p+1}}-\left[\frac{1}{N} \sum_{i=1}^{N} y_{i} \cdot\left(\beta_{0}+x_{i}^{T} \beta\right)-\log \left(1+e^{\left(\beta_{0}+x_{i}^{T} \beta\right)}\right)\right]+\lambda\left[(1-\alpha)\|\beta\|_{2}^{2} / 2+\alpha\|\beta\|_{1}\right] \]
where \(\beta_0\) and \(\beta\) are the coefficients of the generalized linear model, \(y_i\) is the binary outcome for the \(i\)th individual, \(x_i\) is the vector of covariates of the \(i\)th individual, \(\|\beta\|_{1}\) is the \(\ell_1\) penalty on the coefficients \(\beta\), and \(\|\beta\|_{2}^2\) is the \(\ell_2\) penalty on \(\beta\). The two hyperparameters \(\alpha\) and \(\lambda\) control the penalty: \(\alpha\) bridges the gap between LASSO (\(\alpha=1\)) and Ridge (\(\alpha=0\)), and \(\lambda\) controls the overall strength of the penalty.
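As a minimal sketch, an objective of this form can be fit with scikit-learn's elastic net logistic regression. This is an assumption about tooling, not the original implementation: scikit-learn uses `l1_ratio` in the role of \(\alpha\) and an inverse strength `C`, where \(C \approx 1/(N\lambda)\) roughly maps onto the glmnet-style objective above; `X` and `y` are hypothetical data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: N individuals, p covariates, binary outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = rng.integers(0, 2, size=200)

alpha, lam = 0.2, 0.1  # example values from the tuning ranges reported below

# scikit-learn parameterizes regularization by 1/C rather than lambda;
# C ~ 1 / (N * lambda) is one common mapping to the objective above.
model = LogisticRegression(
    penalty="elasticnet",
    solver="saga",           # the sklearn solver that supports elastic net
    l1_ratio=alpha,          # alpha: 1 = pure LASSO, 0 = pure Ridge
    C=1.0 / (len(y) * lam),
    max_iter=5000,
)
model.fit(X, y)
risk = model.predict_proba(X)[:, 1]  # predicted risk for each individual
```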
XGBoost:
XGBoost stands for “Extreme Gradient Boosting”, a fast and scalable implementation of the gradient boosting framework (T. Chen and Guestrin 2016b). It has been used successfully in many applications and has been the winning solution for predictive performance in numerous competitions (Nielsen 2016). Hyperparameter tuning is key to achieving accurate prediction, but it comes at the cost of computation time. Therefore, to balance accuracy and efficiency, the three hyperparameters of top importance (max_depth, eta, and nrounds) were finely tuned, while default values were used for the other hyperparameters. max_depth is the maximum depth of a tree; increasing it yields a more complex model that is more likely to overfit. eta is the step-size shrinkage used in each update to prevent overfitting, and nrounds is the maximum number of boosting iterations.
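A minimal sketch of fitting such a model with the xgboost Python package follows; nrounds is the R package's argument name and corresponds to `num_boost_round` in the Python API, and `X` and `y` are hypothetical data.

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = rng.integers(0, 2, size=200)

dtrain = xgb.DMatrix(X, label=y)

params = {
    "objective": "binary:logistic",  # outputs a predicted risk in (0, 1)
    "max_depth": 6,   # maximum tree depth; larger = more complex, overfit-prone
    "eta": 0.3,       # step-size shrinkage applied at each boosting update
}
# nrounds in the R package corresponds to num_boost_round here.
booster = xgb.train(params, dtrain, num_boost_round=100)
risk = booster.predict(dtrain)
```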
Ensemble:
This method combines multiple algorithms to generate a predicted risk with better predictive performance. Typically, the predicted risks are combined using weights, which can themselves be tuned. In this project, the weights for the two algorithms (EN-ALR and XGBoost) were both set to 0.5.
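A minimal sketch of the equal-weight combination, assuming `risk_en` and `risk_xgb` hold the predicted risks from the two fitted models:

```python
import numpy as np

def ensemble_risk(risk_en, risk_xgb, w_en=0.5, w_xgb=0.5):
    """Weighted combination of the two algorithms' predicted risks."""
    return w_en * np.asarray(risk_en) + w_xgb * np.asarray(risk_xgb)
```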
Hyperparameter tuning:
For each of EN-ALR and XGBoost, 50 hyperparameter settings were randomly sampled from search spaces obtained from a coarse manual tuning (EN-ALR: \(\alpha\in[0.05,\ 0.3]\) and \(\lambda\in[0.05,\ 0.3]\); XGBoost: max_depth \(\in[5,\ 30]\), eta \(\in[0.1,\ 0.5]\), and nrounds \(\in[10,\ 150]\)). The optimal setting for each algorithm was then selected from the 50 candidates based on a weighted sum of the AUC, AP, and sBrS.
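The random sampling step might look like the following sketch; `evaluate` is a hypothetical scorer returning the weighted sum of scaled AUC, AP, and sBrS described next.

```python
import numpy as np

rng = np.random.default_rng(0)

# 50 random draws from each of the coarse-tuned search spaces above.
en_settings = [
    {"alpha": rng.uniform(0.05, 0.3), "lambda": rng.uniform(0.05, 0.3)}
    for _ in range(50)
]
xgb_settings = [
    {
        "max_depth": int(rng.integers(5, 31)),   # integers 5..30 inclusive
        "eta": rng.uniform(0.1, 0.5),
        "nrounds": int(rng.integers(10, 151)),   # integers 10..150 inclusive
    }
    for _ in range(50)
]

# `evaluate` is a hypothetical function that fits a model under one
# setting and returns the weighted sum of scaled AUC, AP, and sBrS:
# best = max(xgb_settings, key=evaluate)
```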
A weighted sum of AUC, AP, and sBrS:
AUC, AP, and sBrS are the metrics used to evaluate the models. To incorporate all three, an equal-weighted sum was used to find the optimal hyperparameters. In addition, to prevent any one metric from dominating the ranking of the weighted sum because of its magnitude across the 50 hyperparameter settings, AUC, AP, and sBrS were each scaled to [0, 1] before being weighted, i.e.
\[ \begin{align} AUC_{\text{scaled}} &= \frac{AUC - \min(AUC)}{\max(AUC) - \min(AUC)} \\ AP_{\text{scaled}} &= \frac{AP - \min(AP)}{\max(AP) - \min(AP)} \\ sBrS_{\text{scaled}} &= \frac{sBrS - \min(sBrS)}{\max(sBrS) - \min(sBrS)} \end{align} \]
Then the weighted sum of the three metrics can be expressed as:
\[ \frac{1}{3}\left(AUC_{scaled} + AP_{scaled} + sBrS_{scaled}\right) \]
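As a sketch, the scaling and equal-weight sum over the 50 candidate settings, assuming `auc`, `ap`, and `sbrs` are arrays of the raw metric values:

```python
import numpy as np

def minmax(x):
    """Scale a vector of metric values to [0, 1]."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def weighted_score(auc, ap, sbrs):
    """Equal-weight sum of the three metrics, each scaled to [0, 1]
    across the 50 candidate hyperparameter settings."""
    return (minmax(auc) + minmax(ap) + minmax(sbrs)) / 3.0
```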
Modification of predicted risks:
It should be noted that the predicted risks are not guaranteed to increase monotonically with age, because the prediction models for different ages were developed separately. To avoid occasional decreases in the predicted risks, we force the predicted risk at age \(A\) to be equal to or greater than the maximum of the predicted risks at ages \(\le A\), i.e.
\[ \begin{align} Risk_A^{modified} &= \max(risk_A,\ risk_{A^-}) \\ risk_A &= \text{predicted risk by age } A \\ risk_{A^-} &= \text{maximum predicted risk at ages younger than } A \end{align} \]
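Because this correction takes a running maximum over increasing ages, it can be sketched as a cumulative maximum, assuming `risks_by_age` is ordered from youngest to oldest:

```python
import numpy as np

def enforce_monotone(risks_by_age):
    """Replace each age-specific risk with the maximum predicted
    risk at that age or any younger age (a running maximum)."""
    return np.maximum.accumulate(np.asarray(risks_by_age))

# e.g. [0.10, 0.08, 0.15] -> [0.10, 0.10, 0.15]
```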
Predictors:
Table G.1 lists the predictors used in modeling. For EN-ALR, chemotherapy agents that were rarely used in the study sample (busulfan, CCNU, chlorambucil, melphalan, thiotepa, idarubicin, and mitoxantrone) were coded as binary Yes/No. In contrast to regression methods, XGBoost, as a tree-based machine learning algorithm, “prefers” continuous variables over categorical ones because it can split them at any point that minimizes the loss function. Therefore, the doses of these chemotherapy agents were used in developing the XGBoost model; a coding sketch follows the table.
Variable Description | EN-ALR | XGBoost |
---|---|---|
Race (3 levels) | Categorical | Categorical |
Age at Cancer Diagnosis | Continuous | Continuous |
BMT Indicator | Binary | Binary |
Cancer Diagnosis Type (8 levels) | Categorical | Categorical |
Minimum Ovarian Radiation Dose | Continuous | Continuous |
Radiation Dose to Pituitary | Continuous | Continuous |
Total Body Irradiation Dose | Continuous | Continuous |
CED | Continuous | Continuous |
BCNU | Continuous | Continuous |
Busulfan | Binary | Continuous |
CCNU | Binary | Continuous |
Chlorambucil | Binary | Continuous |
Cyclophosphamide | Continuous | Continuous |
Ifosfamide | Continuous | Continuous |
Melphalan | Binary | Continuous |
Nitrogen Mustard | Continuous | Continuous |
Procarbazine | Continuous | Continuous |
Thiotepa | Binary | Continuous |
Carboplatin | Continuous | Continuous |
Cis_Platinum | Continuous | Continuous |
Bleomycin | Continuous | Continuous |
Daunorubicin | Continuous | Continuous |
Doxorubicin | Continuous | Continuous |
Idarubicin | Binary | Continuous |
Methotrexate | Continuous | Continuous |
Mitoxantrone | Binary | Continuous |
VM 26 | Continuous | Continuous |
VP 16 | Continuous | Continuous |
Interaction: Age at Cancer Diagnosis and BMT | Continuous | NA |
Interaction: Age at Cancer Diagnosis and Minimum Ovarian RT Dose | Continuous | NA |
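A minimal sketch of the two codings, assuming hypothetical dose columns named after the rarely used agents in Table G.1 (the actual variable names may differ): for EN-ALR the doses are dichotomized to Yes/No, while XGBoost uses the continuous dose columns as-is.

```python
import pandas as pd

# Hypothetical dose columns for the rarely used agents in Table G.1.
rare_agents = ["busulfan", "ccnu", "chlorambucil", "melphalan",
               "thiotepa", "idarubicin", "mitoxantrone"]

def code_predictors_en_alr(df: pd.DataFrame) -> pd.DataFrame:
    """EN-ALR coding: dichotomize rare agents to 1 (any dose) / 0 (none).
    For XGBoost, the original continuous dose columns are used unchanged."""
    out = df.copy()
    out[rare_agents] = (out[rare_agents] > 0).astype(int)
    return out
```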