The variable z describes the input value, which is the vector of entries X = {x_0, x_1, …, x_{n−1}} (for n features in the dataset) multiplied by the weight values, which are tweaked as the model tries to predict y with respect to W^T.
y(x) = w_0·x_0 + w_1·x_1 + ⋯ + w_{n−1}·x_{n−1} = Σ_{i=0}^{n−1} w_i·x_i = W^T X
Equation 2: output y as a function of the input values X
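As a quick numerical check (with arbitrary example values, not taken from the text), the explicit sum in Equation 2 gives the same result as the dot product W^T X:

```python
import numpy as np

# Arbitrary example weights and features (n = 3); illustrative only.
W = np.array([0.5, -1.2, 2.0])
X = np.array([1.0, 0.3, -0.7])

# The explicit sum w_0*x_0 + w_1*x_1 + ... from Equation 2.
explicit = sum(W[i] * X[i] for i in range(len(X)))

# The same quantity written compactly as the dot product W^T X.
dot = W @ X

print(explicit, dot)
```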
Thus, in the case of logistic regression, this abstract function becomes y = σ(W^T X). During training, the weight values (w_i) are randomly initialized and then adjusted so that the loss function is minimized, and a threshold on the output determines whether y = 1 or y = 0.
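A minimal sketch of this training loop in Python (the toy data, learning rate, and use of the cross-entropy loss are illustrative assumptions, not taken from the text): the weights start random, the output is computed as y = σ(W^T X), and each gradient step nudges W toward a lower loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # The logistic function sigma(z) = 1 / (1 + e^-z).
    return 1.0 / (1.0 + np.exp(-z))

# Toy dataset: 4 samples with n = 2 features each; labels are illustrative.
X = np.array([[-2.0, -1.0], [-1.0, -2.0], [1.0, 2.0], [2.0, 1.0]])
t = np.array([0.0, 0.0, 1.0, 1.0])

W = rng.normal(size=2)   # weights are randomly initialized
lr = 0.5                 # illustrative learning rate

for _ in range(2000):
    y = sigmoid(X @ W)               # y = sigma(W^T X) for every sample
    grad = X.T @ (y - t) / len(t)    # gradient of the cross-entropy loss
    W -= lr * grad                   # adjust weights to minimize the loss

# Thresholding the output at 0.5 yields the predicted class (y = 1 or y = 0).
pred = (sigmoid(X @ W) >= 0.5).astype(int)
print(pred)   # → [0 0 1 1]
```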
Logistic regression is one of the simplest machine learning algorithms: it needs few conditions to generate satisfactory results and usually does not require much CPU power. It also overfits less than more complex algorithms and can easily be updated with new data. Nevertheless, its simplicity hinders its performance on higher-dimensional datasets; highly correlated variables in a dataset should be avoided, and it needs to be trained on larger datasets without redundant records [15].
3.2. Decision Tree
The decision tree classifier is a tree-shaped algorithm that is commonly used for classification
applications.
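How a split condition is chosen is not detailed here, but a common criterion (an assumption, not stated in the text) is Gini impurity: candidate conditions are evaluated, and the one whose resulting subsets are purest is kept. A small sketch:

```python
from collections import Counter

def gini(labels):
    # Gini impurity: 1 - sum of squared class proportions (0 means pure).
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_impurity(values, labels, threshold):
    # Weighted Gini impurity after splitting on the condition value <= threshold.
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Toy one-feature dataset; the 0/1 labels are illustrative.
values = [1, 2, 3, 10, 11, 12]
labels = [0, 0, 0, 1, 1, 1]

# The condition "value <= 3" separates the two classes perfectly...
print(split_impurity(values, labels, 3))   # → 0.0
# ...while "value <= 1" leaves an impure right branch, so it scores worse.
print(split_impurity(values, labels, 1))
```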
Figure 2: decision tree classifier representation (source [16])
The root node represents the beginning of the decision tree and includes the whole dataset. It is then divided further (splitting) as the algorithm applies conditions to the dataset, creating sub-classes according to the outcome for each entry. Through the splitting process, branches are created, as different classes of data follow different paths. The leaf nodes represent the