Table of Contents
Acknowledgments ................................................................................................................. 2
Abstract ................................................................................................................................ 3
Abstract (in Greek) ............................................................................................................. 3
Extended abstract (in Greek) ................................................................................................ 4
Table of Contents .................................................................................................................. 7
Figures index ......................................................................................................................... 9
Tables index ........................................................................................................................ 10
Equations index ................................................................................................................... 12
1. Introduction .................................................................................................................... 13
2. An overview of machine learning and anomaly detection research .................................. 14
2.1. Supervised machine learning .................................................................................... 15
2.2. Anomaly detection with machine learning: related research ..................................... 16
2.3. The advantages of the NSL-KDD dataset .................................................................... 17
3. Classification models analysis .......................................................................................... 18
3.1. Logistic Regression .................................................................................................... 18
3.2. Decision Tree ............................................................................................................ 19
3.3. K-Nearest Neighbours .............................................................................................. 20
3.4. Gaussian Naïve Bayes ................................................................................................ 21
3.5. Multi-Layer Perceptron ............................................................................................. 22
4. Characteristics and pre-processing of the NSL-KDD dataset ............................................. 24
4.1. The attack labels (traffic type) ................................................................................... 24
4.2. The features of NSL-KDD ........................................................................................... 29
4.2.1. Categorical features ........................................................................................... 29
4.3. Pre-processing of the NSL-KDD dataset ..................................................................... 34
4.3.1. One-hot encoding .............................................................................................. 37
4.3.2. Correlation ......................................................................................................... 38
4.3.3. X and Y components, scaling the data ................................................................. 43
5. Evaluation and results ...................................................................................................... 46
5.1. Interpreting the Classification Reports ...................................................................... 48
5.2. Evaluation and results compared to relevant research .............................................. 50