3. Classification models analysis

This section describes and briefly analyses the algorithms that were used for the
classifications, to better understand how they work and what the advantages and
disadvantages of their use are.

3.1. Logistic Regression

Despite its name, logistic regression is a supervised classification algorithm, one that uses
regression to calculate the probability that a specific data entry (input $X_i$) belongs to
category $Y_j$. Describing it first as a binary classification problem makes the mechanics of
the algorithm easier to understand, and its use can then be extended to multiclass
classification problems, since multiclass logistic regression can be carried out the same way
as the binary case, in a one-against-all way: the class being examined is labelled 1, whereas
all other classes are labelled 0, when $g(z)$ is evaluated for each specific entry.
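
The one-against-all labelling described above can be illustrated with a minimal Python sketch. The function name `one_vs_all_labels` and the example class names are hypothetical, chosen only for illustration; they do not come from the models used in this work.

```python
def one_vs_all_labels(labels, target_class):
    """Return 1 for entries of target_class and 0 for every other class."""
    return [1 if label == target_class else 0 for label in labels]


# Hypothetical class labels, used only for illustration.
labels = ["cat", "dog", "bird", "dog", "cat"]

# One binary problem per class: the examined class is 1, all others are 0.
binary_targets = {
    cls: one_vs_all_labels(labels, cls) for cls in sorted(set(labels))
}

for cls, targets in binary_targets.items():
    print(cls, targets)
```

Each of the resulting binary target lists can then be fitted with an ordinary binary logistic regression. A common way to obtain a final prediction (an assumption here, not something stated above) is to select the class whose binary model returns the highest probability.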

The function that logistic regression uses for the calculation of the probability is the sigmoid
function [14]:

$$g(z) = \frac{1}{1 + e^{-z}}$$

Equation 1: logistic/sigmoid function

Figure 1: sigmoid function graph (source [14])

The graph of the function shows that $g(z)$ tends toward 1 as $z \to \infty$ and toward 0 as
$z \to -\infty$. Its output therefore always lies between 0 and 1, which is why the sigmoid
works well as a function for expressing probabilities.
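
The limiting behaviour of the sigmoid can be checked numerically with a short Python sketch. The function name `sigmoid` and the sample values of z are illustrative choices. Splitting the computation on the sign of z is one common way to avoid overflow for large negative inputs, and is not part of the definition in Equation 1.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), evaluated without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z),
    # so that exp() is never called with a large positive argument.
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Sample inputs chosen only to show the trend toward 0 and 1.
for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"g({z:+.1f}) = {sigmoid(z):.5f}")
```

At z = 0 the function returns exactly 0.5, and the values move toward 0 and 1 as z becomes strongly negative or strongly positive, in line with the graph in Figure 1.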