Divergent performance
■ Training and test sets are very different
– Distribution and types of labels
– Difficulty levels
– Services and flags (highly correlated with many other features and with output)
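One way to make the label mismatch concrete is to compare label shares between the two sets and list labels that occur only in the test set. The sketch below is a minimal illustration under stated assumptions: the helper names `label_shares` and `compare_label_sets` are hypothetical, and the toy label lists are invented for the example rather than taken from the real data.

```python
from collections import Counter


def label_shares(labels):
    """Return the relative frequency of each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def compare_label_sets(train_labels, test_labels):
    """Return labels unseen in training and the per-label share difference."""
    train_share = label_shares(train_labels)
    test_share = label_shares(test_labels)
    # Labels that appear in the test set but never in the training set.
    unseen = sorted(set(test_share) - set(train_share))
    # Positive value: the label is more frequent in the test set.
    shift = {
        label: test_share.get(label, 0.0) - train_share.get(label, 0.0)
        for label in sorted(set(train_share) | set(test_share))
    }
    return unseen, shift


# Toy labels (hypothetical, for illustration only).
train_labels = ["normal"] * 6 + ["dos"] * 3 + ["probe"] * 1
test_labels = ["normal"] * 4 + ["dos"] * 2 + ["probe"] * 2 + ["new_attack"] * 2

unseen, shift = compare_label_sets(train_labels, test_labels)
print("labels only in test:", unseen)
for label, delta in shift.items():
    print(f"{label:>12}: {delta:+.2f}")
```

The same comparison could be run on categorical features such as service and flag to see whether their value distributions differ between the two sets as well.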
■ Check for overfitting:
– Case A: training set KDDTrain+, test (validation) set KDDTest+
– Case B: training and test sets are both drawn from KDDTrain+ (using scikit-learn's train_test_split)
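The two cases can be sketched as follows. This is a rough illustration, not the original experiment: the arrays are synthetic stand-ins generated with NumPy (the external set is deliberately shifted to mimic a mismatched test set), the decision tree is an arbitrary choice of model, and `fit_and_score` is a hypothetical helper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): replace with the real KDDTrain+ / KDDTest+ data.
X_train_full = rng.normal(0.0, 1.0, size=(400, 5))
y_train_full = (X_train_full[:, 0] + X_train_full[:, 1] > 0).astype(int)
X_ext = rng.normal(0.5, 1.5, size=(200, 5))  # deliberately shifted distribution
y_ext = (X_ext[:, 0] + X_ext[:, 1] > 0).astype(int)


def fit_and_score(X_tr, y_tr, X_te, y_te):
    """Fit a model and return (training accuracy, test accuracy)."""
    model = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    return model.score(X_tr, y_tr), model.score(X_te, y_te)


# Case A: train on the full training set, evaluate on the separate test set.
train_acc_a, test_acc_a = fit_and_score(X_train_full, y_train_full, X_ext, y_ext)

# Case B: hold out part of the training set itself.
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train_full, y_train_full,
    test_size=0.25, random_state=0, stratify=y_train_full,
)
train_acc_b, test_acc_b = fit_and_score(X_tr, y_tr, X_val, y_val)

print(f"Case A: train={train_acc_a:.3f}  test={test_acc_a:.3f}")
print(f"Case B: train={train_acc_b:.3f}  test={test_acc_b:.3f}")
```

If the held-out accuracy in Case B stays close to the training accuracy while Case A drops sharply, the gap is more plausibly explained by the mismatch between the two sets than by overfitting alone.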