Linear to Logistic Regression — Classification (Part — I)
Linear regression identifies the relationship between a dependent variable (y) and one or more independent variables (xi). In this relationship the dependent variable is continuous, i.e. it can take any value, such as 2.2, 5, 9.1 or 25.
Linear regression does well on problems such as predicting future house prices or vehicle mileage. But what about problems whose output is one of two or more classes, for example email spam or not spam, or Covid positive or negative?
Can a linear regression model help to find the class? Let’s check.
Refer to the regression equation (Eq. 1) below. With optimized weights and a given input value x, it calculates the best estimate of y. The optimized weights define the line that best fits the data; refer to the figure below. In most regression problems the data is scattered around such a line.
y = w0 + w1 * x    (Eq. 1)
Now the question is: what does the data look like in a classification problem?
Refer to Fig. (2), a binary classification case in which every data value is either 1.0 or 0.0.
If we try to fit a linear equation to this classification data, the line passes through the data points as shown in Fig. (3). The X-axis holds the independent input data and the Y-axis holds the dependent classification data points (Covid positive or negative). Let’s assume that if Y >= 0.5 (the threshold), the output is Covid positive (Y = 1); otherwise it is Covid negative (Y = 0). For the current data this works fine.
Let’s add a new data point with Y = 1 on the extreme left, i.e. near the Y-axis. New optimized weights are calculated from this enlarged data set, which changes the position and slope of the line and, with them, the threshold needed to separate the two classes. For example, the threshold value might change from 0.5 to 0.4 after the new data points are added.
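The effect described above can be checked with a small sketch. The toy data, the helper names `fit_line` and `crossing_point`, and the position of the added point are all assumptions made for illustration; they are not the data behind the figures. The sketch fits a line by ordinary least squares, finds the input value where the line reaches 0.5, and repeats the fit after one positive point is added on the extreme left.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = w0 + w1 * x with a single input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w0, w1


def crossing_point(w0, w1, threshold=0.5):
    """Input value x at which the fitted line reaches the threshold."""
    return (threshold - w0) / w1


# Hypothetical toy data: small x belongs to class 0, large x to class 1.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w0, w1 = fit_line(xs, ys)
before = crossing_point(w0, w1)

# Add one new positive point on the extreme left (near the Y-axis).
xs_new = xs + [0.5]
ys_new = ys + [1]
w0_new, w1_new = fit_line(xs_new, ys_new)
after = crossing_point(w0_new, w1_new)

print("crossing before:", round(before, 3))
print("crossing after: ", round(after, 3))

# Value of the new line at the old boundary: the threshold that would be
# needed to keep separating the classes at the same input value.
print("threshold needed after refit:", round(w0_new + w1_new * before, 3))
```

In this sketch a single extra point flattens the line and moves the place where it reaches 0.5, so a fixed threshold of 0.5 no longer separates the original points the way it did before.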
Key outcome of the above discussion:
“Adding new data points forces the threshold value to be redefined.”
The way to avoid redefining the threshold is to use a function that outputs the probability (a value between 0 and 1) of the outcome.
The next question is: which function is suited to this purpose?
The sigmoid function is the standard answer. It is a non-linear function with an S-shaped graph whose values range between 0 and 1; refer to Fig. (4). Its mathematical expression is:
σ(x) = 1 / (1 + e^(-x))    (Eq. 2)
where x is the input variable and σ(x) is the output.
Let’s check the function value for various inputs; refer to Table-1 below.
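The same check can be done in a few lines of code. This is a minimal sketch using only the Python standard library; the input values are chosen here for illustration and are not necessarily those of Table-1.

```python
import math


def sigmoid(x):
    """Sigmoid function: maps any real input to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


# Illustrative inputs: large negative, zero, and large positive values.
inputs = [-10, -5, -1, 0, 1, 5, 10]
values = [sigmoid(x) for x in inputs]

for x, s in zip(inputs, values):
    print(f"x = {x:>4}   sigmoid(x) = {s:.5f}")
```

Large negative inputs give values close to 0, large positive inputs give values close to 1, and an input of 0 gives exactly 0.5, which is why 0.5 is a natural threshold for the output.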
The final form of the classification equation comes from combining Eq. (1) and Eq. (2):
p = 1 / (1 + e^(-(w0 + w1 * x)))
where w0 and w1 are the weights, x is the input and p is a probability between 0 and 1.
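As a rough sketch of how this equation is used, the code below feeds the linear term w0 + w1 * x through the sigmoid and then applies a 0.5 threshold to the resulting probability. The weight values and the function names `predict_probability` and `predict_class` are hypothetical, picked only for illustration; in practice the weights are learned from data.

```python
import math


def predict_probability(x, w0, w1):
    """Probability p = sigmoid(w0 + w1 * x) for a single input x."""
    z = w0 + w1 * x
    return 1.0 / (1.0 + math.exp(-z))


def predict_class(x, w0, w1, threshold=0.5):
    """Class 1 if the probability reaches the threshold, otherwise class 0."""
    return 1 if predict_probability(x, w0, w1) >= threshold else 0


# Hypothetical weights (assumed, not fitted): p is 0.5 at x = 4.5.
W0, W1 = -4.5, 1.0

for x in [2, 4.5, 8]:
    p = predict_probability(x, W0, W1)
    print(f"x = {x}: p = {p:.4f}, class = {predict_class(x, W0, W1)}")
```

However large or small x becomes, p stays between 0 and 1, so the same 0.5 threshold keeps its meaning as "more likely positive than negative".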
References
[2] Binary Classification data graph
[3] Regression with classification data points
[4] Sigmoid graph