Saturday 2 May 2020

Linear Regression - Part 1


Introduction

Linear regression is a supervised learning technique for modelling quantitative data. The model fits a line that is closest to all observation points. The basic assumption is that the functional form is a line and that it is possible to fit a line that will be closest to all observations. Because of its simplicity, linear regression serves as a good starting point and provides a benchmark on which more complex models can be built.

The following figure shows a 2-dimensional X-Y plot and the corresponding line that fits between the points so that it is closest to all of them.

Regression Line
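
To make this concrete, here is a minimal Python sketch (using numpy and made-up data) that fits such a line to a cloud of points; the data values are purely illustrative:

    # A minimal sketch of fitting a regression line, using numpy on made-up data.
    import numpy as np

    # Hypothetical observations: Y roughly follows 2 + 0.5*X plus noise.
    rng = np.random.default_rng(0)
    X = np.linspace(0, 10, 20)
    Y = 2 + 0.5 * X + rng.normal(0, 0.5, size=X.size)

    # np.polyfit with degree 1 returns (slope, intercept) of the least-squares line.
    slope, intercept = np.polyfit(X, Y, 1)
    print(f"fitted line: Y = {intercept:.2f} + {slope:.2f} * X")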

Basic maths on how to draw a line in two dimensions.

Let's start with the basics of what parameters are required to draw a line in a two-dimensional plane. From the following figure, it is easy to see that an intercept (β0) and a slope (β1) are required to draw the line.
  • Intercept - The intercept defines the Y value when the input X = 0.
  • Slope - The slope defines the angle by which the line is rotated about the intercept.
Intercept (β0) and slope (β1) of a line.

If we change either of them, the position or orientation of the line will change, resulting in a new line.
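
As a quick illustration, the small Python sketch below evaluates Y = β0 + β1*X for a few arbitrary (β0, β1) pairs to show that changing either parameter produces a different line:

    # Plain-Python sketch: intercept (b0) and slope (b1) fully determine a line;
    # changing either one gives a different line.
    def line(x, b0, b1):
        """Y value on the line with intercept b0 and slope b1."""
        return b0 + b1 * x

    for b0, b1 in [(0, 1), (2, 1), (0, 3)]:
        ys = [line(x, b0, b1) for x in (0, 1, 2)]
        print(f"b0={b0}, b1={b1} -> Y at X=0,1,2: {ys}")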

As shown in the following 3-d figure, the understanding in two dimensions can be extended to higher dimensions as well. For example, in three dimensions, with one output variable (Z) and two input variables (X and Y), three parameters, one intercept (β0) and two slopes (β1, β2), are required to draw the plane that fits the points.

Intercept (β0) and slopes (β1, β2) of a 3-d plane.


Basic Maths for Linear Regression.

Mathematically, the linear regression equation can be written in one of the following two ways, depending on the number of input variables.

Simple Regression

Simple regression has one input variable (predictor) and one output variable (response variable). In the following equation, X is the input feature, Y is the output variable, and ε is the error term:

                       Y = β0 + β1X + ε

Multiple Regression

Multiple regression has more than one input variable (predictor) and one output variable (response variable). In the following equation, X = (x1, x2, ..., xn) are the input variables and Y is the output variable:

                       Y = β0 + β1x1 + β2x2 + ... + βnxn + ε



The simple and multiple regression techniques allow us to estimate the coefficients (β0, β1, ..., βn) of the line from the sample dataset provided. This lets us estimate the line so that it is nearest to all sample points collectively.
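
As an illustration, here is a minimal multiple-regression sketch using scikit-learn on synthetic data; the true coefficients and noise level below are made up for the example:

    # Minimal multiple-regression sketch with scikit-learn on synthetic data.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    n = 100
    X = rng.uniform(0, 10, size=(n, 2))            # two input variables (predictors)
    # Hypothetical ground truth: Y = 1.5 + 2.0*x1 - 0.5*x2 + noise
    Y = 1.5 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 0.3, size=n)

    model = LinearRegression().fit(X, Y)
    print("intercept (b0):", model.intercept_)     # should be close to 1.5
    print("slopes (b1, b2):", model.coef_)         # should be close to [2.0, -0.5]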

How to estimate the coefficients?

The coefficients are estimated in two steps:
  • Estimate the error - This defines the error between the actual and predicted value for each observation.
  • Minimize the error - There are various ways to minimize the error, but LEAST SQUARES is the most popular among them. As the name implies, least squares minimizes the total error after taking the square of the prediction error for every observation.

Step-1: Estimate the error

The estimation error is the difference between the actual value and the predicted value. From the following figure, we can see that the error (e9) is the difference between the actual value and the predicted value for observation #9.
Error estimation
Based on the above understanding, it is possible to calculate the error for each individual observation in the model.
For a sample having n observations, the individual observation errors will be [e1, e2, e3, e4, e5, ..., en].
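
A small Python sketch of this step, with made-up data and assumed coefficients, shows how each observation's error is computed:

    # Sketch of Step-1: the error (residual) for each observation is the
    # difference between its actual value and the value predicted by the line.
    import numpy as np

    # Hypothetical data and an assumed fitted line Y_hat = b0 + b1*X.
    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
    b0, b1 = 1.0, 1.0                     # assumed coefficients for illustration

    Y_hat = b0 + b1 * X                   # predicted values on the line
    errors = Y - Y_hat                    # e_i = actual - predicted
    print("errors:", errors)              # [e1, e2, ..., en]
    print("RSS:", np.sum(errors ** 2))    # residual sum of squares (Step-2 input)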

Step-2: Minimize the error 


Now that we have found the observation errors, the next step is to minimize the total error in the model. First the residual sum of squares (RSS) is calculated, and then this total error is minimized to obtain the coefficients for the intercept and slope:

Calculate the Residual Sum of Squares (RSS) and differentiate - Calculate the total sum of squared errors. Note that the errors are squared so that positive and negative errors do not cancel out, and so that the total error equation can be differentiated to obtain the intercept and slope coefficients of the least-error line.
                       
                       RSS = e1^2 + e2^2 + e3^2 + ... + en^2

                       RSS = (y1 - β0 - β1x1)^2 + (y2 - β0 - β1x2)^2 + ... + (yn - β0 - β1xn)^2

Setting the derivatives of the RSS with respect to β0 and β1 to zero gives the coefficients:

                       β1 = Σ(xi - x̄)(yi - ȳ) / Σ(xi - x̄)^2

                       β0 = ȳ - β1x̄

where x̄ and ȳ are the sample means of X and Y.
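
These closed-form estimates are straightforward to implement; the sketch below computes them with numpy on made-up data and cross-checks the result against numpy's own polyfit:

    # Closed-form least-squares estimates for simple regression.
    import numpy as np

    def least_squares(x, y):
        """Return (b0, b1) minimizing the RSS for the line Y = b0 + b1*X."""
        x_bar, y_bar = x.mean(), y.mean()
        b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
        b0 = y_bar - b1 * x_bar
        return b0, b1

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
    b0, b1 = least_squares(x, y)
    print(f"b0={b0:.3f}, b1={b1:.3f}")
    # Cross-check against numpy's polyfit (degree 1), which returns [slope, intercept].
    print(np.polyfit(x, y, 1))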

How to interpret the coefficients?


Simple Regression

Suppose we want to predict Y (= Sales) of the product based on the input X (= TV budget), and the coefficients β0 and β1 are calculated as below:

                       Y = 7.03 + 0.0475X

  • β1 = 0.0475 - The average increase in Y = Sales associated with a one-unit increase in X = TV budget. We can conclude that for an additional increase in TV budget of 1,000 units, sales will increase by 47.5 units.
  • β0 = 7.03 - The expected value of Y = Sales when X = TV budget is equal to 0. We can conclude that if the TV budget is 0, then default sales will be 7.03 units.
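
Plugging these coefficients into the regression equation gives predictions directly; a small sketch using the values above:

    # Predicting sales from the fitted simple-regression coefficients above.
    b0, b1 = 7.03, 0.0475

    def predicted_sales(tv_budget):
        """Predicted sales for a given TV budget, from Y = b0 + b1*X."""
        return b0 + b1 * tv_budget

    print(predicted_sales(0))      # 7.03  -> baseline sales with no TV budget
    print(predicted_sales(1000))   # 54.53 -> 7.03 plus 47.5 extra units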

Multiple Regression

Suppose we want to predict Y (= Sales) of the product based on three inputs, X1 (TV budget), X2 (Radio budget), and X3 (Newspaper budget), and the corresponding coefficients β0, β1, β2, and β3 are calculated as below:

                       Y = β0 + β1X1 + β2X2 + β3X3, with β0 = 2.939 and β1 = 0.046

  • β1 = 0.046 - The average increase in Y = Sales associated with a one-unit increase in X1 = TV budget, provided that there is no increase in the budgets of the other predictors (Radio and Newspaper). We can conclude that for an additional increase in TV budget of 1,000 units, with all other budgets held constant, sales will increase by 46 units.
  • β0 = 2.939 - The expected value of Y = Sales when there has been no budgetary expenditure on TV, Radio, or Newspaper. In this case we can conclude that default sales will be 2.939 units.
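
A similar sketch works for the multiple-regression case; note that the post gives only β0 and β1, so the β2 and β3 values below are purely hypothetical placeholders:

    # Multiple-regression prediction with the coefficients above; b2 and b3 are
    # placeholder values for illustration only (the post gives just b0 and b1).
    b0, b1 = 2.939, 0.046
    b2, b3 = 0.1, 0.01      # hypothetical Radio and Newspaper coefficients

    def predicted_sales(tv, radio, newspaper):
        return b0 + b1 * tv + b2 * radio + b3 * newspaper

    print(predicted_sales(0, 0, 0))      # 2.939 -> baseline sales
    print(predicted_sales(1000, 0, 0))   # adds b1 * 1000 = 46 units from TV alone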



Summary

In this blog, I presented the basic concepts of regression and how to calculate the LEAST ERROR line by estimating the corresponding coefficients for the intercept and slope. In the next blog, I will explain how to calculate the errors in the intercept and slope coefficients and in the regression model. I will also present a use case to show how to use the various parameters to evaluate the model.



