Ridge regression differs from OLS regression very slightly. Mathematically, OLS regression uses the formula

$$B = (X^\top X)^{-1} X^\top Y$$

whereas ridge regression uses the formula

$$B = (X^\top X + kI)^{-1} X^\top Y$$

I wanted to use ridge regression to avoid multicollinearity, but got back very strange results that were substantially worse than simply using regress(). In MATLAB, calling the function ridge requires an X, a Y, and a value for k. Theoretically, if k is set to zero, the two formulas should be identical; but when I call both functions back to back in my code, using the same values of X and Y, I get two very different vectors for B (shown below). Can someone explain why this happens?

    b_ridge = ridge(Y_current, X, 0)
       12.4525
        9.0099
        0.2808
       -1.5426
       -1.1107

    b_regress = regress(Y_current, X)
        3.5586
        0.8805
        0.1670
       -0.3934
       -0.8526

Best Answer


According to the ridge documentation:

The results are computed after centering and scaling the x columns so they have mean 0 and standard deviation 1.
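In formula terms, this means the ridge estimate is not computed from X itself. As a rough sketch, with Z, \bar{x}_j and s_j (the column mean and standard deviation) being notation introduced here rather than taken from the documentation:

$$z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \qquad B_{\text{ridge}}(k) = (Z^\top Z + kI)^{-1} Z^\top Y$$

With k = 0 this reduces to OLS on the standardized matrix Z, not on X, so each coefficient is expressed per standard deviation of its predictor rather than per original unit.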

Here's an example using column vectors:

    >> x = randn(5,1);
    >> y = randn(5,1);
    >> ridge(y, x, 0)
    ans =
      -0.045681220595243
    >> regress(y, x)
    ans =
      -0.028738686366027
    >> regress(y, (x-mean(x))/std(x))
    ans =
      -0.045681220595243
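To get ridge coefficients on the original scale of the data, ridge accepts an optional fourth argument, scaled. The sketch below uses made-up data (the sizes, scales, and true coefficients are assumptions chosen only for illustration) and assumes the Statistics and Machine Learning Toolbox is available. With scaled set to 0, ridge returns an intercept as the first element, so the fair comparison is against regress with an explicit column of ones:

```matlab
% Hypothetical data: 50 observations, 3 predictors on very different scales.
rng(0);                                   % fix the seed so the run is repeatable
n = 50;
X = randn(n, 3) * diag([1 10 100]);       % scale each column differently
y = 2 + X * [1; 0.5; -0.02] + randn(n, 1);

% Default behaviour: coefficients refer to the centered and scaled predictors,
% and no intercept is returned (3 values).
b_scaled = ridge(y, X, 0);

% scaled = 0: coefficients are restored to the original scale of X and an
% intercept is prepended (4 values).
b_ridge = ridge(y, X, 0, 0);

% OLS with an explicit intercept column, which regress does not add itself.
b_ols = regress(y, [ones(n, 1) X]);

% With k = 0 the two columns should agree up to rounding error.
disp([b_ridge b_ols]);
disp(max(abs(b_ridge - b_ols)));

% The default output differs from the slopes only by the column standard deviations.
disp(max(abs(b_scaled ./ std(X)' - b_ridge(2:end))));
```

If the X in the question has no column of ones, then regress(Y_current, X) fits a model without an intercept, while ridge implicitly accounts for one by centering; that alone would be enough to make the two results differ even after rescaling.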