Ridge regression differs only slightly from OLS regression. Mathematically, OLS regression uses the formula
$$B = (X^\top X)^{-1} X^\top Y$$
whereas ridge regression uses the formula
$$B = (X^\top X + kI)^{-1} X^\top Y$$
I wanted to use ridge regression to avoid multicollinearity, but got back very strange results that were substantially worse than simply using regress(). In MATLAB, calling the function ridge requires a Y, an X, and a value for k. In theory, if k is set to zero, the two formulas should be identical; but when I call both functions back to back in my code, using the same values of X and Y, I get two very different coefficient vectors B (shown below). Can someone explain why this happens?
b_ridge = ridge(Y_current,X, 0)
   12.4525
    9.0099
    0.2808
   -1.5426
   -1.1107

b_regress = regress(Y_current,X)
    3.5586
    0.8805
    0.1670
   -0.3934
   -0.8526
Best Answer
According to the ridge documentation:

The results are computed after centering and scaling the x columns so they have mean 0 and standard deviation 1.
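
In formula terms, that sentence can be read as follows. This is only a sketch: the symbols $Z$, $\bar{x}_j$, $s_j$ and $b^{(z)}$ are notation introduced here, not taken from the documentation. Each column of X is standardized first, and the ridge formula is then applied to the standardized matrix:

$$z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \qquad b^{(z)} = (Z^\top Z + kI)^{-1} Z^\top Y$$

With k = 0 this is OLS on Z rather than on X, so each coefficient is expressed per standard deviation of its predictor and there is no intercept term. Assuming no column of X is constant, the coefficients map back to the original units as

$$b_j = \frac{b^{(z)}_j}{s_j}, \qquad b_0 = \bar{Y} - \sum_j b_j \bar{x}_j$$

which are the slopes and intercept of an OLS fit of Y on X with a constant term.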
Here's an example using column vectors:
>> x = randn(5,1);
>> y = randn(5,1);
>> ridge(y, x, 0)
ans =
  -0.045681220595243
>> regress(y, x)
ans =
  -0.028738686366027
>> regress(y, (x-mean(x))/std(x))
ans =
  -0.045681220595243
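
If the goal is coefficients that can be compared with regress directly, one option is the fourth argument of ridge, scaled. Setting it to 0 returns the coefficients on the original scale of X, with the constant term as the first element. The sketch below uses synthetic data (the sizes, seed, and true coefficients are made up for illustration and are not the asker's X and Y_current) and assumes X has no column of ones and no constant column.

```matlab
% Sketch only: synthetic data, not the X and Y_current from the question.
rng(0);                                   % fixed seed so the run is repeatable
n = 50;
X = randn(n, 3);                          % three predictors, no column of ones
y = 2 + X * [1; -0.5; 0.25] + 0.1 * randn(n, 1);

% Default call: coefficients refer to the standardized columns of X.
b_std = ridge(y, X, 0);                   % 3-by-1, no intercept

% Same result by standardizing manually and running OLS on Z.
Z = zscore(X);                            % mean 0, standard deviation 1 per column
b_manual = regress(y, Z);

% scaled = 0: coefficients restored to the original scale of X,
% with the constant term returned as the first element.
b_orig = ridge(y, X, 0, 0);               % 4-by-1: [intercept; slopes]

% OLS with an explicit constant term, for comparison.
b_ols = regress(y, [ones(n, 1) X]);

% Both differences should be at the level of floating-point round-off.
disp(max(abs(b_std - b_manual)));
disp(max(abs(b_orig - b_ols)));
```

Note that regress only fits a constant term when X contains a column of ones, so the comparison holds against regress(y, [ones(n,1) X]) and not against regress(y, X) on the raw predictors.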