Module:   STA671  Colloquium on Applied Statistics

Supervised learning with missing values

Talk by Prof. Dr. Julie Josse

Date: 26.02.20  Time: 16.15 - 17.15  Room: ETH HG G 19.1

Abstract: In many application settings, the data have missing features, which makes data analysis challenging. An abundant literature addresses missing data in an inferential framework: estimating parameters and their variance from incomplete tables. Here, we consider supervised-learning settings: predicting a target when missing values appear in both training and testing data. I will present the consistency of different approaches to prediction for general and linear models, including the use of very simple imputation methods. We further analyze decision trees. These can naturally tackle empirical risk minimization with missing values, thanks to their ability to handle the half-discrete nature of incomplete variables.

Reference papers:
https://arxiv.org/abs/1902.06931
http://juliejosse.com/wp-content/uploads/2020/01/aistats-2.pdf