Good morning,
I am trying to perform White's test on my linear model in R, but I don't know how to write the R code for it.
Price : house price, in millions of dollars
Bdrms : number of bedrooms
Lotsize : size of lot in square feet
Sqrft : size of house in square feet
The linear model is the following :
# Linear model
# I() is needed so that ^2 squares LOTSIZE; a bare LOTSIZE^2 inside a formula reduces to LOTSIZE
LinearModel.1 <- lm(PRICE ~ LOTSIZE + I(LOTSIZE^2) + SQRFT + BDRMS, data=Dataset)
summary(LinearModel.1)

# Breusch-Pagan test
library(lmtest)
bptest(LinearModel.1, varformula = NULL, studentize = TRUE, data = Dataset)

# White test: ?????????
Thanks for your answer.
Kind regards,
Best Answer
m <- LinearModel.1
data <- Dataset
u2 <- m$residuals^2
y <- fitted(m)
Ru2 <- summary(lm(u2 ~ y + I(y^2)))$r.squared
LM <- nrow(data)*Ru2
p.value <- 1-pchisq(LM, 2)
p.value
If p.value < 0.05, then H0 (there is no heteroskedasticity) is rejected at the 5% significance level, and you conclude that there is heteroskedasticity in your model.
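For reference, the computation above corresponds to the following auxiliary regression and LM statistic. This is written in standard textbook notation; the symbols are not from the original post, with \hat{u}_i the residuals, \hat{y}_i the fitted values and n the number of observations.

```latex
\hat{u}_i^2 = \delta_0 + \delta_1 \hat{y}_i + \delta_2 \hat{y}_i^2 + v_i,
\qquad
H_0 : \delta_1 = \delta_2 = 0,
\qquad
LM = n \, R^2_{\hat{u}^2} \;\overset{a}{\sim}\; \chi^2_2 \quad \text{under } H_0 .
```

The 2 passed to pchisq is the number of slope terms in this auxiliary regression.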
The White Test has been implemented in the package "bstats". After installing and loading this package, a White Test is performed on a linear model object by simply typing
white.test(lm0)
See this page for a description and an example.
White's Test is now implemented in the white_lm function of the skedastic package; see https://www.rdocumentation.org/packages/skedastic/versions/1.0.0/topics/white_lm
White's test can be implemented using the bptest function from the lmtest package, as follows:
reg <- lm(y~x1+x2) # storing regression
bptest(reg, ~ poly(fitted(reg) , 2))
Notice that the above implements the special form of the test, which uses the fitted value of y and its squared value (hence the option poly(fitted(reg) , 2) in bptest) as regressors in the second stage equation. The normal form uses all regressors, their squared values and their interactions as elements in the second stage regression, at the cost of losing degrees of freedom. To implement that form you need to replace ~ poly(fitted(reg) , 2) with something like ~ x1*x2 + x1*x3 ... , together with squared terms such as I(x1^2), since x1*x2 in a formula adds only the levels and their interaction. If you have a lot of regressors, it might be easier to use another package.
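As a rough illustration of the normal form for a model with two regressors, here is a minimal base-R sketch. The simulated data, the variable names (dat, aux, LM_full and so on) and the choice of degrees of freedom (the number of slope terms in the auxiliary regression) are assumptions made for the example, not part of the answer above. The last lines call bptest only if lmtest happens to be installed.

```r
# Simulated data: the error standard deviation grows with x1 (an assumption
# made only so that the example has heteroskedasticity to detect).
set.seed(42)
n  <- 300
x1 <- runif(n, 1, 5)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n, sd = x1)
dat <- data.frame(y, x1, x2)

reg <- lm(y ~ x1 + x2, data = dat)

# Auxiliary regression: squared residuals on levels, squares and the cross product.
dat$u2 <- residuals(reg)^2
aux <- lm(u2 ~ x1 * x2 + I(x1^2) + I(x2^2), data = dat)

df_full <- length(coef(aux)) - 1          # slope terms: x1, x2, x1^2, x2^2, x1:x2
LM_full <- n * summary(aux)$r.squared     # LM statistic, n * R^2
p_full  <- pchisq(LM_full, df = df_full, lower.tail = FALSE)
c(LM = LM_full, df = df_full, p.value = p_full)

# If lmtest is installed, bptest() with the same variance formula should
# report the same statistic (studentize = TRUE is its default).
if (requireNamespace("lmtest", quietly = TRUE)) {
  print(lmtest::bptest(reg, ~ x1 * x2 + I(x1^2) + I(x2^2), data = dat))
}
```

With more regressors the variance formula grows quickly, which is the loss of degrees of freedom mentioned above.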
I have written a function to reproduce two methods in R for White's test in hendry2007econometric.
test_white(mod, dat, resi2 ~ x1 + x2 + I(x1^2) + I(x2^2), 3)
where the squared residuals are regressed on all regressors and their squares. The degrees of freedom equal the number of parameters (let's say k).
test_white(mod, dat, resi2 ~ x1 + x2 + I(x1^2) + I(x2^2) + I(x1 * x2), 6)
where the squared residuals are regressed on all regressors, their squares, and their cross products. The degrees of freedom are k * (k + 1) / 2.
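library(dplyr)    # mutate()
library(magrittr) # %>% and %<>%
library(tibble)   # tibble()

test_white <- function(mod, dat, f, df1, prob){
  # default significance level
  if(missing(prob)){
    prob = 0.05
  }
  # add the squared residuals of the main model to the data
  dat %<>% mutate(resi2 = mod$residuals^2)
  # LM statistic: n * R^2 of the auxiliary regression
  stat <- lm(f, data = dat) %>%
    {summary(.)$r.squared} %>%
    {. * nrow(dat)}
  p_value <- stat %>%
    {1 - pchisq(., df1)}
  results <- tibble(whi = "White", stat = stat, df1 = df1, df2 = nrow(dat) - df1, p_value = p_value,
                    prob = prob, if_accept = {p_value <= prob}, if_pass = {p_value >= prob})
  return(results)
}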
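For readers who would rather avoid the dplyr, magrittr and tibble dependencies, the same n * R^2 computation can be sketched in base R. The function name white_test_base, its arguments and the simulated data are hypothetical and only for illustration. One deliberate difference: when df1 is not supplied, this sketch falls back to the number of slope terms in the auxiliary regression, which is the usual textbook choice; pass df1 explicitly to follow a different convention. It also assumes that no rows were dropped when the main model was fitted.

```r
# Hypothetical base-R counterpart of test_white(); all names are illustrative.
white_test_base <- function(mod, dat, f, df1 = NULL, prob = 0.05) {
  dat$resi2 <- residuals(mod)^2                 # squared residuals of the main model
  aux <- lm(f, data = dat)                      # auxiliary regression
  if (is.null(df1)) {
    df1 <- length(coef(aux)) - 1                # assumption: slope terms of aux
  }
  stat    <- nrow(dat) * summary(aux)$r.squared # LM statistic, n * R^2
  p_value <- pchisq(stat, df1, lower.tail = FALSE)
  data.frame(test = "White", stat = stat, df1 = df1,
             p_value = p_value, prob = prob, reject_H0 = p_value < prob)
}

# Simulated data with error variance that depends on x1 (for illustration only).
set.seed(7)
dat <- data.frame(x1 = runif(250, 1, 5), x2 = rnorm(250))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(250, sd = dat$x1)
mod <- lm(y ~ x1 + x2, data = dat)

# Squares only, then squares plus the cross product.
res_sq    <- white_test_base(mod, dat, resi2 ~ x1 + x2 + I(x1^2) + I(x2^2))
res_cross <- white_test_base(mod, dat, resi2 ~ x1 + x2 + I(x1^2) + I(x2^2) + I(x1 * x2))
rbind(res_sq, res_cross)
```

The two rows of the result correspond to the two calls of test_white shown earlier, up to the choice of degrees of freedom.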
I think @Mike K's approach is fine: it effectively tests whether an lm model fitted to the relationship shown in the scale-location plot is significant. The scale-location plot is the diagnostic you get for an lm object from plot(model, which = 3), which dispatches to plot.lm.
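To make that analogy concrete, the sketch below computes the quantities shown in the scale-location plot by hand and fits a regression to them. It is only an illustration on simulated data with made-up names (sl, aux_sl, p_sl), and it is not White's test itself: the scale-location plot uses the square root of the absolute standardized residuals, whereas the test regresses the squared raw residuals.

```r
# Simulated data with error standard deviation proportional to x (illustration only).
set.seed(3)
n <- 200
x <- runif(n, 1, 5)
y <- 1 + 2 * x + rnorm(n, sd = x)
model <- lm(y ~ x)

# Quantities drawn by plot(model, which = 3):
# sqrt(|standardized residuals|) against the fitted values.
sl   <- sqrt(abs(rstandard(model)))
yhat <- fitted(model)

# Regress the plotted quantity on the fitted values and their square,
# then test the two slope terms jointly with the overall F test.
aux_sl <- lm(sl ~ yhat + I(yhat^2))
fstat  <- summary(aux_sl)$fstatistic
p_sl   <- pf(fstat["value"], fstat["numdf"], fstat["dendf"], lower.tail = FALSE)
unname(p_sl)
```

A small p-value here points in the same direction as a rejection in the special form of the test, but the two statistics are not identical.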