A model whose predictions don't change much with the particular data it was trained on has low variance, while one whose predictions aren't too far off from the true values has low bias.
A model with very high variance goes beyond the real patterns of the data and starts capturing noise: it is overfitting the data.
This is where regularization comes in!

In practice, regularization tunes the values of the coefficients by shrinking them toward zero, which is why regularization techniques are also called shrinkage techniques.

Some coefficients can be shrunk to zero.
Even though it's commonly used for linear models, regularization can also be applied to non-linear models.

Wouldn't it be interesting if you could predict how long your dog will nap tomorrow?

Altogether they affect your dog's energy levels and, consequently, how much they will nap during the day.
Since you want to predict your dog's nap duration, you start thinking about it as a multivariate linear model.

The remaining betas are the unknown coefficients which, along with the intercept, are the missing pieces of the model.
Additionally, you think there's a linear relationship between the features and the target.
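Written out, with generic names x1 through x4 standing in for the four features, the model looks like this:

```latex
\text{nap duration} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \varepsilon
```

where epsilon is the usual error term.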
For how long will your dog nap tomorrow?

With the general idea of the model in mind, you collected data for a few days.
Now you have real observations of the features and the target of your model.
But there are still a few critical pieces missing: the coefficient values and the intercept.

One of the most popular methods to find the coefficients of a linear model is Ordinary Least Squares.
Because the residuals are squared in the residual sum of squares, not all residuals are treated equally: larger errors weigh much more heavily than small ones.
In Python, you could use scikit-learn to fit a linear model to the data using Ordinary Least Squares.
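A minimal sketch of that step, assuming the observations live in a pandas DataFrame (the file name dog_naps.csv and the column name nap_duration are hypothetical):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical dataset: the four features plus the nap-duration target
dog_naps = pd.read_csv("dog_naps.csv")
X = dog_naps.drop(columns=["nap_duration"])
y = dog_naps["nap_duration"]

# Hold out 20% of the observations as a test set, chosen at random
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit a linear model with Ordinary Least Squares
ols_model = LinearRegression()
ols_model.fit(X_train, y_train)
```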

In this case, the test set is made up of 20% of the original dataset, picked at random.
After fitting a linear model to the training set, you might check its characteristics.
The coefficients and the intercept are the last pieces you needed to define your model and make predictions.
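Continuing the sketch above, the fitted attributes hold those last pieces, and score returns the R² on the held-out data:

```python
# Estimated coefficients and intercept of the fitted linear model
print("Coefficients:", ols_model.coef_)
print("Intercept:", ols_model.intercept_)

# R^2 of the model on the test set
print("R^2:", ols_model.score(X_test, y_test))
```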

The R² score shows how much of the variability in the target is explained by the features [1].
Let's use regularization to reduce the variance while trying to keep bias as low as possible.
You do that by applying a penalty to the Residual Sum of Squares [1].
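For reference, with n observations and p features the Residual Sum of Squares is

```latex
\mathrm{RSS} = \sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Bigr)^{2}
```

and the shrinkage penalty is added on top of this quantity.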

The tuning parameter, lambda, controls the impact of the shrinkage penalty on the residual sum of squares.
If all features have coefficient zero, the target will be equal to the value of the intercept.
Even though it shrinks each model coefficient in the same proportion, Ridge Regression will never actually shrink them to zero.
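Concretely, Ridge Regression estimates the coefficients by minimizing the RSS plus a squared (L2) penalty on the coefficients [1]:

```latex
\sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Bigr)^{2} + \lambda \sum_{j=1}^{p}\beta_j^{2}
```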

The very aspect that makes this regularization more stable is also one of its disadvantages.
With an arbitrary lambda of 0.5, you might fit a model with Ridge Regression by running the following code.
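(A minimal sketch of that code, reusing the hypothetical X_train and y_train from above; note that scikit-learn names the tuning parameter alpha rather than lambda.)

```python
from sklearn.linear_model import Ridge

# Arbitrary lambda of 0.5, passed to scikit-learn as alpha
ridge_model = Ridge(alpha=0.5)
ridge_model.fit(X_train, y_train)

# Shrunken coefficients and intercept
print("Coefficients:", ridge_model.coef_)
print("Intercept:", ridge_model.intercept_)
```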
However, the complexity and interpretability of the model remained the same.

You still have four features that impact the duration of your dog's daily nap.
Let's turn to Lasso and see how it performs.
Lasso penalizes the absolute value of the coefficients, their L1 norm, which is why it is also referred to as L1 regularization.
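In the same notation as before, Lasso picks the coefficients that minimize [1]:

```latex
\sum_{i=1}^{n}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Bigr)^{2} + \lambda \sum_{j=1}^{p}\lvert \beta_j \rvert
```

The only difference from Ridge is the absolute-value penalty, and that difference is what allows coefficients to reach exactly zero.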

Lasso uses a technique called soft-thresholding [1].
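In its generic form, the soft-thresholding operator is (the exact threshold Lasso applies is proportional to lambda, with a constant that depends on how the objective is scaled):

```latex
S_{\tau}(z) = \operatorname{sign}(z)\,\max\bigl(\lvert z \rvert - \tau,\ 0\bigr)
```

Coefficients whose magnitude falls below the threshold are set exactly to zero, which is how Lasso drops features.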
Again, with an arbitrary lambda of 0.5, you're free to fit Lasso to the data.
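A sketch mirroring the Ridge snippet, again with the hypothetical training data:

```python
from sklearn.linear_model import Lasso

# Arbitrary lambda of 0.5, again passed as alpha
lasso_model = Lasso(alpha=0.5)
lasso_model.fit(X_train, y_train)

# Some of these coefficients may now be exactly zero
print("Coefficients:", lasso_model.coef_)
print("Intercept:", lasso_model.intercept_)
```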
Plotting the coefficients over a range of lambda values reinforces the fact that Ridge Regression is a much more stable technique than Lasso.

As for Lasso, there's a bit more variation.
Here's how you could create these plots in Python.
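One way to sketch such plots with matplotlib, assuming the same hypothetical X_train and y_train and sweeping lambda over a small range (the exact values used in the article aren't reproduced here):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import Ridge, Lasso

# Range of lambda (alpha) values to try
lambdas = np.linspace(0.01, 10, 50)

fig, axes = plt.subplots(1, 2, figsize=(12, 4), sharey=True)
for ax, Model, title in zip(axes, (Ridge, Lasso), ("Ridge", "Lasso")):
    # One array of coefficients per lambda value
    coefs = np.array([Model(alpha=lam).fit(X_train, y_train).coef_ for lam in lambdas])
    ax.plot(lambdas, coefs)  # one line per feature
    ax.set_title(f"{title} coefficients vs. lambda")
    ax.set_xlabel("lambda (alpha)")
axes[0].set_ylabel("coefficient value")
plt.tight_layout()
plt.show()
```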
We can verify this by fitting the data again, now with more targeted values of lambda.

Using Lasso, you ended up significantly reducing both variance and bias.
With Ridge Regression, the model keeps all of its features and, as lambda increases, overall bias and variance get lower.
When to use Lasso vs Ridge?

Use Lasso when …
In this case, Lasso will pick out the dominant features and shrink the coefficients of the other features to zero.
Use Ridge Regression when …
At the end of the day, it's all about trade-offs!

References
[1] Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (2013). An Introduction to Statistical Learning: with Applications in R. Springer.
[2] Robert Tibshirani (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58, 267–288.
This article was originally published by Carolina Bento on Towards Data Science.

You can read the original piece here.