Machine Learning with Python: Nonlinear Regression

Introduction

In this blog post we’ll be discussing nonlinear regression. In particular, we’ll analyze polynomial regression, one of the most common ways to quickly create a nonlinear regression model by expanding the existing feature set. Nonlinear regression allows us to model relationships between variables that don’t have a clear linear relationship, which means models like basic linear regression or even multivariate linear regression won’t work effectively on these data sets. We’ll begin by discussing what nonlinear regression is, different use cases and applications, and how it relates to machine learning, and then we’ll dive into some examples using Python!

What is Nonlinear Regression in Machine Learning?

Nonlinear regression is a general description for statistical techniques used to model the relationship between a dependent variable and one or more independent variables. Unlike linear regression, which assumes a linear relationship between the independent features and the dependent label, nonlinear regression allows more complex relationships to be modeled. In the real world, not every data set will follow the linear relationship that linear regression assumes.

So, how can we create a model for a nonlinear relationship? There are many different methods! As discussed in “Introduction to Statistical Learning” (known as ISLR, which the authors make available as a free download), nonlinear regression models can be classified into several different categories. These include polynomial regression (our main example in this post), logarithmic regression, and exponential regression.
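As a rough sketch of what these categories can look like for a single feature x (the exact functional forms vary by source; these are illustrative):

$$\hat{y} = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_d x^d \quad \text{(polynomial)}$$

$$\hat{y} = \beta_0 + \beta_1 \ln(x) \quad \text{(logarithmic)}$$

$$\hat{y} = \beta_0 e^{\beta_1 x} \quad \text{(exponential)}$$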

The most common form of nonlinear regression is polynomial regression, which expands the model to include interaction terms and features raised to higher powers. We’ll be exploring polynomial regression later on in this post, but first, let’s explore some applications of nonlinear regression.

For example, imagine the following data plotted below:

Scatterplot of Data

Clearly, this data does not follow a linear relationship between the X axis and Y axis, which means we can’t simply fit a linear model. Instead, we’ll need to devise a nonlinear model (for example a higher-order polynomial regression) to fit the data.

Here we can see the results of a linear model on the data (degree of 1) vs. a higher order nonlinear polynomial regression (degree of 4):

Linear vs. Nonlinear Model Output
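If you’d like to reproduce a comparison like this yourself, here is a minimal sketch on synthetic data (the data behind the plot above isn’t shown, so we generate our own nonlinear sample; the coefficients and noise level are arbitrary):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Generate a synthetic nonlinear data set (a stand-in for the plotted data)
rng = np.random.default_rng(42)
X = np.sort(rng.uniform(-3, 3, 60)).reshape(-1, 1)
y = 0.5 * X.ravel()**3 - X.ravel()**2 + 2 + rng.normal(0, 2, 60)

plt.scatter(X, y, color='gray', label='data')
for degree in [1, 4]:
    # Expand the feature to the given degree, then fit an ordinary linear model
    poly = PolynomialFeatures(degree=degree, include_bias=False)
    model = LinearRegression().fit(poly.fit_transform(X), y)
    plt.plot(X, model.predict(poly.transform(X)), label=f'degree {degree}')
plt.legend()
plt.show()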

As you can see above, a nonlinear regression can fit the data better. Let’s explore some real world applications of nonlinear regression.

Applications of Nonlinear Regression

Since so many real-world data sets won’t follow a linear relationship, there are many applications of nonlinear regression.

These applications include predictive modeling, time series forecasting, function approximation, and unraveling intricate relationships between variables. These models can be especially helpful when a linear model cannot adequately capture the patterns in the data and more complicated, flexible functions are needed. Numerous industries, including banking, engineering, and medicine, use nonlinear regression, and it can be applied to a variety of data types, including time-dependent, categorical, and continuous data. The objective of nonlinear regression is to find the nonlinear function that best captures the relationship between the input and output variables and to use that function to make accurate predictions. Let’s discuss some different types of nonlinear regression in the next section.

Types of Nonlinear Regression

There are many types of nonlinear regression. In Python, numerous machine learning models can be used to predict a continuous label in a nonlinear fashion. For example:

  • Neural networks: This model consists of interconnected layers of artificial neurons that allow neural networks to learn nonlinear relationships between inputs and outputs.

  • Decision trees and random forests: Tree-based methods such as decision trees and random forests can be used to model nonlinear relationships, since they use recursive partitioning of the feature space to capture them.

  • Support Vector Regression (SVR): SVR is technically a linear model. However, you can change the kernel used in order to learn nonlinear decision boundaries, such as using a radial basis function (RBF) kernel.

  • K-Nearest Neighbors (KNN): This is a simple, non-parametric method used for both classification and regression. While usually not the first choice for a regression task, it can still be used as a “quick” nonlinear check.

  • Polynomial regression: This is a form of linear regression in which the relationship between the independent variable x and the dependent variable y is modeled as an nth degree polynomial. Polynomial regression is a good first choice when in need of a nonlinear model because it is simple and interpretable. It can model a wide range of nonlinear relationships by adding polynomial terms to the linear model. Additionally, it is easy to implement, as we shall explore in the implementation section of this post (a quick code sketch of a few of the models above follows this list).
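To give a feel for how interchangeable these estimators are in Scikit-Learn, here is a minimal sketch fitting a few of them on the same toy nonlinear data (the models and hyperparameters here are arbitrary defaults, not tuned recommendations):

import numpy as np
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor

# Toy nonlinear data: y = sin(x) plus noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 6, 100).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 100)

# Each estimator shares the same fit/predict interface,
# so swapping in a different nonlinear model is a one-line change
models = {
    "SVR (RBF kernel)": SVR(kernel="rbf"),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "KNN": KNeighborsRegressor(n_neighbors=5),
}
for name, model in models.items():
    model.fit(X, y)
    print(name, model.predict([[3.0]]))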

Nonlinear Regression Examples

There are many situations where one can use nonlinear regression.

In the world of finance, nonlinear regression can be used to model the nonlinear relationship between stock prices and company factors, such as fundamentals from company filings or technical analysis. Nonlinear regression can help investors and analysts forecast the future performance of a company or even try to explain the historical performance of its stock price.

In the field of medical science, nonlinear regression is often used to model the relationship between a patient’s physiological measurements and the likelihood of developing an illness, such as heart disease. You may even be familiar with some aspects of these models, such as using the results of blood tests or weight measurements to analyze how “at-risk” a patient is for a particular disease.

In civil engineering, nonlinear regression is used to model the relationship between the components of cement (such as limestone, sand, clay, additives, etc.) and its strength. A common case study is modeling a concrete slump test result from composition data alone and comparing the modeled results to the real-world slump test.

In the implementation below, we’ll use nonlinear regression on an advertising dataset by creating a set of polynomial features first and then applying a linear model, effectively creating a polynomial regression. We’ll analyze different advertising spend channels and see how they can be used to predict sales.

How to Implement Nonlinear Regression

Let’s perform a nonlinear regression using polynomial regression with Python and Scikit-Learn. First we’ll start with the imports and reading in our sample data, which is the “Advertising.csv” file from Introduction to Statistical Learning.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Read in our Data Set
df = pd.read_csv("Advertising.csv")

# Separate the features (the ad spend columns) from the label
# (assumes the label column is named 'sales', as in the ISLR file)
X = df.drop('sales', axis=1)
y = df['sales']

Now using Scikit-Learn, we’ll import PolynomialFeatures, which will help us transform our original data set by adding polynomial features.

We will go from an equation of the form (shown here as if we only had one x feature):

$$\hat{y} = \beta_0 + \beta_1x_1 + \epsilon $$
 
and create more features from the original x feature for some polynomial of degree d.
 
$$\hat{y} = \beta_0 + \beta_1x_1 + \beta_2x^2_1 + \dots + \beta_dx^d_1 + \epsilon$$
 
Then we can fit the linear regression model on it, since in reality we’re just treating these new polynomial features x^2, x^3, … x^d as new features. Obviously we need to be careful about choosing the correct value of d, the degree of the model. Our metric results on the test set will help us with this!
 
It is also worth noting we have multiple X features, not just a single one as in the formula above. So in reality, the PolynomialFeatures will also take interaction terms into account. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].
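As a quick sanity check on that claim, here is a minimal sketch expanding a single [a, b] sample (the values 2 and 3 are arbitrary):

from sklearn.preprocessing import PolynomialFeatures
import numpy as np

# One two-dimensional sample: [a, b] = [2, 3]
sample = np.array([[2, 3]])
# include_bias=True keeps the leading 1 from [1, a, b, a^2, ab, b^2]
demo = PolynomialFeatures(degree=2, include_bias=True)
print(demo.fit_transform(sample))  # [[1. 2. 3. 4. 6. 9.]]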
Now let’s apply the converter to our full feature set:

from sklearn.preprocessing import PolynomialFeatures
polynomial_converter = PolynomialFeatures(degree=2, include_bias=False)
# Converter "fits" to data, in this case, reads in every X column
# Then it "transforms" and outputs the new polynomial data
poly_features = polynomial_converter.fit_transform(X)
poly_features.shape

Now that we have the polynomial features, we can perform our train test split with Scikit-Learn:

from sklearn.model_selection import train_test_split
# random_state makes the split reproducible:
# https://stackoverflow.com/questions/28064634/random-state-pseudo-random-number-in-scikit-learn
X_train, X_test, y_train, y_test = train_test_split(poly_features, y, test_size=0.3, random_state=101)

Next we fit a linear regression model on the training data:

from sklearn.linear_model import LinearRegression
model = LinearRegression(fit_intercept=True)
model.fit(X_train, y_train)

We want to fairly evaluate our model, so we get performance metrics on the test set (data the model has never seen before).

from sklearn.metrics import mean_absolute_error, mean_squared_error

# Predict on the unseen test data
test_predictions = model.predict(X_test)

# Calculate error metrics on the test set
MAE = mean_absolute_error(y_test, test_predictions)
MSE = mean_squared_error(y_test, test_predictions)
RMSE = np.sqrt(MSE)

Now, depending on your data set and results, you may not be satisfied with the RMSE performance shown above, in which case we can begin to adjust the hyperparameters of the model, the most important one here being the degree of the polynomial.

It is now up to us to possibly go back and adjust our model and parameters. Let’s explore higher-order polynomials in a loop and plot out their error. This will nicely lead us into a discussion on overfitting.

Let’s use a for loop to do the following:

  1. Create different order polynomial X data
  2. Split that polynomial data for train/test
  3. Fit on the training data
  4. Report back the metrics on both the train and test results
  5. Plot these results and explore overfitting
# TRAINING ERROR PER DEGREE
train_rmse_errors = []
# TEST ERROR PER DEGREE
test_rmse_errors = []

for d in range(1,10):

    # CREATE POLY DATA SET FOR DEGREE "d"
    polynomial_converter = PolynomialFeatures(degree=d, include_bias=False)
    poly_features = polynomial_converter.fit_transform(X)

    # SPLIT THIS NEW POLY DATA SET
    X_train, X_test, y_train, y_test = train_test_split(poly_features, y, test_size=0.3, random_state=101)

    # TRAIN ON THIS NEW POLY SET
    model = LinearRegression(fit_intercept=True)
    model.fit(X_train, y_train)

    # PREDICT ON BOTH TRAIN AND TEST
    train_pred = model.predict(X_train)
    test_pred = model.predict(X_test)

    # CALCULATE RMSE ON TRAIN AND TEST
    train_RMSE = np.sqrt(mean_squared_error(y_train, train_pred))
    test_RMSE = np.sqrt(mean_squared_error(y_test, test_pred))

    # APPEND ERRORS TO LISTS FOR PLOTTING LATER
    train_rmse_errors.append(train_RMSE)
    test_rmse_errors.append(test_RMSE)

Now we can plot the results with Matplotlib. We’ll only plot the first five degrees, since the test error for the higher-degree models grows so large that it would dwarf the rest of the plot:

plt.plot(range(1,6), train_rmse_errors[:5], label='TRAIN')
plt.plot(range(1,6), test_rmse_errors[:5], label='TEST')
plt.xlabel("Polynomial Complexity")
plt.ylabel("RMSE")
plt.legend()
Train vs. Test RMSE by Polynomial Complexity

Here we can see that a polynomial degree of 3 probably makes the most sense, since the test error begins to climb away from the training error (overfitting) at or after a degree of 4. This means we can simply retrain our final model as a degree-3 polynomial, then convert and save it for future use with the following Python code:

# Based on our chart, this could have also been degree=4, but
# it is better to be on the safe side of complexity
final_poly_converter = PolynomialFeatures(degree=3, include_bias=False)
final_model = LinearRegression()
final_model.fit(final_poly_converter.fit_transform(X), y)

# Save both the fitted converter and the trained model to disk
from joblib import dump, load
dump(final_model, 'sales_poly_model.joblib')
dump(final_poly_converter, 'poly_converter.joblib')

# Later, load them back and predict sales for a new ad campaign
# (TV, radio, and newspaper spend)
loaded_poly = load('poly_converter.joblib')
loaded_model = load('sales_poly_model.joblib')
campaign = [[149, 22, 12]]
campaign_poly = loaded_poly.transform(campaign)
loaded_model.predict(campaign_poly)

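As a design note, saving the converter and the model as two separate files works fine, but Scikit-Learn’s Pipeline can bundle both steps into a single object, so there is only one file to save and load. Here is a minimal sketch, assuming the same X, y, and degree of 3 as above (the file name is just an example):

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from joblib import dump, load

# Bundle the polynomial expansion and the regression into one estimator
pipe = Pipeline([
    ("poly", PolynomialFeatures(degree=3, include_bias=False)),
    ("regression", LinearRegression()),
])
pipe.fit(X, y)
dump(pipe, 'sales_poly_pipeline.joblib')  # hypothetical file name

# Later: a single load and a single predict call
loaded_pipe = load('sales_poly_pipeline.joblib')
loaded_pipe.predict([[149, 22, 12]])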
Pros and Cons of Nonlinear Regression

While we can see that nonlinear regression is easily implemented in Scikit-Learn, let’s take a look at some of the pros and cons of nonlinear regression. We’ll start off with the pros:

Nonlinear Regression Pros:

  • It allows for more flexibility in modeling the relationship between the dependent and independent variables, as it can capture nonlinear and non-additive relationships. This is particularly useful when the relationship between the variables is complex and cannot be accurately represented by a linear model.
  • It may take into account a wider variety of underlying distributions and better represent non-normal data, improving predictions and providing new insights into the relationship between variables.
  • In general, nonlinear regression has the capability to deal with more complex relationships between your features and your label.

It’s not a perfect solution, though! There are some disadvantages; let’s discuss some of the cons:

Nonlinear Regression Cons:

  • Typically, it will be more computationally intensive and time consuming than plain linear regression, due to the additional workload of the higher-order models.
  • Depending on the data and use case, it can sometimes be difficult to directly interpret a nonlinear model’s results. For example, the intuition behind a higher-order feature is hard to comprehend for most people.
  • In order to perform nonlinear regression, you will have to select an appropriate model from a variety of options, so you’ll need a good understanding of how the candidate models behave. A great way to build that understanding is to take one of our Machine Learning with Python classes!

Difference Between Nonlinear Regression and Linear Regression

As we’ve seen above, nonlinear regression models allow us to model the relationship between dependent and independent variables that don’t have a linear relationship. Keep in mind, however, that nonlinear regression will typically require more data to produce accurate models and will often require us to transform the data, as we did in our implementation example through the PolynomialFeatures tool that Scikit-Learn provides.

Summary

In summary, we’ve explored nonlinear regression and a variety of use cases, applications, and algorithm examples. We also walked through an example implementation of nonlinear regression via polynomial regression with Python and Scikit-Learn. Hopefully you can apply this knowledge about nonlinear regression to your own data sets and machine learning tasks using our Python examples!
