Evaluation
■ Case B is not valuable or useful
– Traffic data won’t be so close to training data in real life
– Most models use the same dataset split for training and validation
■ Classification reports
– Detailed analysis of all models and classifications performance label by label
– All metrics → 1 for frequently encountered outputs and → 0 for scarce outputs
27