TY - GEN
T1 - Early Prediction of Sepsis Using Gradient Boosting Decision Trees with Optimal Sample Weighting
AU - Hammoud, Ibrahim
AU - Ramakrishnan, I. V.
AU - Henry, Mark
N1 - Publisher Copyright: © 2019 Creative Commons.
PY - 2019/9
Y1 - 2019/9
N2 - In this work, we describe our early sepsis prediction model for the PhysioNet/Computing in Cardiology Challenge 2019. We prove that maximizing a general family of utility functions (of which the challenge utility function is a special case) is equivalent to minimizing a weighted 0-1 loss. We then utilize this fact to train an ensemble of gradient boosting decision trees using a weighted binary cross-entropy loss. Our model takes the time-series nature of the data into account by using a fixed-size window of all measurements within the last 20 hours as a feature vector. Data were imputed in a way that gives the model the same information that is available to healthcare professionals in real time. We tune the model hyper-parameters using 5-fold cross-validation. The model performance was measured on each evaluation set using the threshold that gives the maximum utility on the training set. Our best model achieves an official normalized utility score of 0.332 on the final full test set of the challenge (Team name: SBU, rank: 6th/78).
UR - https://www.scopus.com/pages/publications/85081123498
U2 - 10.23919/CinC49843.2019.9005700
DO - 10.23919/CinC49843.2019.9005700
M3 - Conference contribution
AN - SCOPUS:85081123498
T3 - Computing in Cardiology
BT - 2019 Computing in Cardiology, CinC 2019
PB - IEEE Computer Society
T2 - 2019 Computing in Cardiology, CinC 2019
Y2 - 8 September 2019 through 11 September 2019
ER -