The field of Data Science has shown tremendous growth in the last decade. The vast amounts of data being generated, together with the huge computational power of modern-day computers, have enabled researchers and scientists to achieve groundbreaking results in Data Science and Artificial Intelligence.
In Data Science, one of the key techniques to master is regression analysis. Linear Regression and Logistic Regression are usually the first algorithms people learn, and their popularity has led many scientists and practitioners to believe that these two are the only regression techniques in Machine Learning, or at least the most important ones.
However, in reality, that is not the case: there are several regression techniques that can be applied in Machine Learning, and each has its own importance depending on the application. This article introduces you to seven types of regression techniques that you should know, extending your knowledge beyond the popular notion of Linear and Logistic Regression.
What is Regression?
Regression analysis is a statistical modeling technique that estimates the relationships among variables. It helps us understand how variation in the independent variables brings about changes in the dependent variable, and it is mostly used in forecasting and prediction. Regression analysis can also be used to infer causal relationships between the dependent and independent variables.
The way the data was generated has a large impact on the performance of regression analysis, so in some cases assumptions are made about the data quality. In regression, the dependent variable can be numerical or discrete (binary, multinomial, or ordinal) in nature.
The types of regression depend on features, target variables, and the shape of the regression line.
We are going to discuss 7 regression techniques in machine learning that you should know.
1. Linear Regression
Linear Regression is one of the oldest and simplest regression techniques, and usually the first one people learn. In this type of regression, the target variable is continuous in nature and maintains a linear relationship with the independent variables.
Linear Regression can be classified into Simple Linear Regression and Multiple Linear Regression. In Simple Linear Regression, there is only one predictor and one target variable, while in Multiple Linear Regression there is more than one independent variable.
It is represented by the equation: Dependent variable = Intercept + Slope * Independent variable + Error
The goal in Linear Regression is to minimize the distance between the actual and predicted data points, i.e., to minimize the residuals and find the best-fitted line.
Linear Regression in Python could be written as –
class sklearn.linear_model.LinearRegression(fit_intercept=True, normalize=False, copy_X=True, n_jobs=None)
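As a quick illustration, here is a minimal sketch of fitting a simple linear regression with scikit-learn. The data is made up for the example: the target follows y = 2x + 1 exactly, so the learned slope and intercept recover those values.

```python
from sklearn.linear_model import LinearRegression

# Toy data (made up): one feature, target follows y = 2*x + 1 exactly
X = [[0], [1], [2], [3]]
y = [1, 3, 5, 7]

model = LinearRegression()   # fit_intercept=True by default
model.fit(X, y)

print(model.coef_)           # slope, close to [2.]
print(model.intercept_)      # intercept, close to 1.0
print(model.predict([[4]]))  # prediction for x = 4, close to [9.]
```

Because the toy data lies exactly on a line, the least-squares fit recovers the slope and intercept to numerical precision; with real, noisy data the residuals would be non-zero.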
2. Logistic Regression
Similar to Linear Regression, the independent variables must maintain a linear relationship with the target, here on the log-odds scale. However, in the case of Logistic Regression, the target or dependent variable is discrete in nature; the discrete values can be binary, multinomial, or ordinal. In a binary classification problem, the model estimates the log of odds, which the sigmoid function converts into a probability used to make predictions, whereas in multi-class problems the Softmax function gives better accuracy.
sklearn logistic regression could be written as –
class sklearn.linear_model.LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='warn', max_iter=100, multi_class='warn', verbose=0, warm_start=False, n_jobs=None)
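A minimal binary-classification sketch with made-up data: one feature, with the class label flipping around x = 2.5.

```python
from sklearn.linear_model import LogisticRegression

# Toy data (made up): class 0 for small x, class 1 for large x
X = [[0], [1], [2], [3], [4], [5]]
y = [0, 0, 0, 1, 1, 1]

clf = LogisticRegression()       # L2 penalty by default
clf.fit(X, y)

print(clf.predict([[1], [4]]))   # hard class labels
print(clf.predict_proba([[4]]))  # probabilities from the sigmoid
```

`predict_proba` exposes the sigmoid output directly, which is often more useful than the hard labels when you need a confidence score or a custom decision threshold.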
3. Polynomial Regression
In Polynomial Regression, there is a non-linear relationship between the dependent and the independent variables. The dependent variable is modeled as an nth-degree polynomial of the independent variable.
Polynomial regression models are generally fitted using the least-squares method. The polynomial feature transformation is represented in sklearn as –
class sklearn.preprocessing.PolynomialFeatures(degree=2, interaction_only=False, include_bias=True)
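In practice, `PolynomialFeatures` is combined with an ordinary linear model: the features are expanded into polynomial terms, then fitted by least squares. A sketch with made-up data that follows y = x² exactly:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Toy data (made up): target is exactly x squared
X = np.arange(5).reshape(-1, 1)   # [[0], [1], [2], [3], [4]]
y = X.ravel() ** 2                # 0, 1, 4, 9, 16

# Expand features to degree 2, then fit a linear model on them
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)

print(model.predict([[5]]))       # close to [25.]
```

The pipeline makes the two-step nature of polynomial regression explicit: the model is still linear in its parameters, only the features are non-linear in x.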
4. Stepwise Regression
In Stepwise Regression, an automatic procedure based on t-tests or F-tests is used to choose the predictor variables. As the name suggests, at each step a variable is either added to or removed from the set of relevant variables, based on some pre-defined criterion.
Stepwise Regression follows three approaches:
Forward selection – variables are added one at a time, and the process stops when no further addition brings a significant improvement.
Backward elimination – variables are deleted one at a time until no more can be removed without a significant loss.
Bidirectional elimination – a combination of the other two approaches.
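scikit-learn does not implement the classic t-test/F-test-based stepwise procedure, but its `SequentialFeatureSelector` (available since scikit-learn 0.24) follows the same greedy add-or-remove idea, scoring each candidate feature set by cross-validation instead of a significance test. A sketch on synthetic data where only two of five features carry signal:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import SequentialFeatureSelector

# Synthetic data: 5 candidate features, only features 0 and 2 matter
rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = 3 * X[:, 0] - 2 * X[:, 2] + rng.randn(100) * 0.1

# Greedy forward selection, one feature added per step,
# scored by cross-validated fit of a linear model
sfs = SequentialFeatureSelector(LinearRegression(),
                                n_features_to_select=2,
                                direction='forward')  # or 'backward'
sfs.fit(X, y)

print(sfs.get_support())  # boolean mask of the selected features
```

With a strong signal, forward selection picks out the two informative features; with correlated or weak predictors the greedy path can miss the best subset, which is the usual caveat with stepwise methods.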
5. Ridge Regression
Often in regression problems, the model becomes too complex and tends to overfit. It then becomes necessary to reduce the variance of the model and prevent overfitting. Ridge Regression is one such technique: it penalizes the size of the coefficients.
>>> from sklearn import linear_model
>>> reg = linear_model.Ridge(alpha=.5)
>>> reg.fit([[0, 0], [0, 0], [1, 1]], [0, .1, 1])
Ridge(alpha=0.5, copy_X=True, fit_intercept=True, max_iter=None,
      normalize=False, random_state=None, solver='auto', tol=0.001)
>>> reg.coef_
array([0.34545455, 0.34545455])
>>> reg.intercept_
0.13636...
The parameter alpha controls the amount of shrinkage: the higher its value, the greater the shrinkage and the more robust the coefficients become to collinearity. Ridge Regression is also known as L2 regularization.
6. Lasso Regression
Lasso is similar to Ridge Regression in terms of its usage. However, it reduces the number of variables upon which the solution depends, and it is widely used in compressed sensing.
>>> from sklearn import linear_model
>>> reg = linear_model.Lasso(alpha=.1)
>>> reg.fit([[0, 0], [1, 1]], [0, 1])
Lasso(alpha=0.1, copy_X=True, fit_intercept=True, max_iter=1000,
      normalize=False, positive=False, precompute=False, random_state=None,
      selection='cyclic', tol=0.0001, warm_start=False)
>>> reg.predict([[1, 1]])
array([0.8])
The objective Lasso minimizes is the least-squares error plus an L1 penalty, alpha * ||w||1, on the coefficients. The higher the penalty term, the more coefficients are driven exactly to zero, which makes Lasso useful for feature selection. It is also known as L1 regularization.
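A small sketch of this feature-selection behaviour on synthetic data: only the first of three features carries signal, and as alpha increases, the coefficients of the irrelevant features are driven exactly to zero.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: 3 features, only feature 0 is informative
rng = np.random.RandomState(0)
X = rng.randn(50, 3)
y = 2 * X[:, 0] + rng.randn(50) * 0.1

for alpha in (0.01, 1.0):
    coefs = Lasso(alpha=alpha).fit(X, y).coef_
    # Larger alpha -> irrelevant coefficients hit exactly zero
    print(alpha, coefs)
```

Unlike Ridge, which only shrinks coefficients toward zero, the L1 penalty produces exact zeros, so the non-zero entries of `coef_` can be read directly as the selected features.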
7. Bayesian Regression
In Bayesian Regression, the regularization parameters are estimated as part of the inference procedure: alpha is treated as a random variable rather than a fixed constant. This adaptability to the data at hand is one of Bayesian Regression's key features; however, the time-consuming inference is a drawback.
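scikit-learn's `BayesianRidge` is one implementation of this idea: the regularization strength is inferred from the data rather than fixed in advance, and predictions come with an uncertainty estimate. A minimal sketch on the same kind of made-up y = 2x + 1 data used earlier:

```python
from sklearn.linear_model import BayesianRidge

# Toy data (made up): y = 2*x + 1
X = [[0], [1], [2], [3]]
y = [1, 3, 5, 7]

model = BayesianRidge()          # priors over alpha and the weights
model.fit(X, y)                  # regularization inferred from the data

# Predictive mean and standard deviation for x = 4
mean, std = model.predict([[4]], return_std=True)
print(mean, std)
```

The `return_std=True` output is what distinguishes the Bayesian approach in practice: instead of a single point estimate, each prediction carries a measure of its own uncertainty.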
Regression is one of the most important techniques in Data Science, and in this blog you got familiar with seven such techniques in regression analysis. Choosing the right one depends on the data and the conditions that need to be applied to it.