I have a list of about a thousand data frames, and I'm applying a function to all of them with lapply. However, there seem to be eight list elements (data frames) that generate this warning message:
In mapply(FUN = f, ..., SIMPLIFY = FALSE) : longer argument not a multiple of length of shorter
So basically I'd like to track down the elements that generate the warning so I can make the necessary fixes to them, but I don't know how. So far I've just been going through the data frames individually, applying the function to each one and seeing whether or not that specific data frame generates the warning (like this: testdata <- my_function(df[[1]], "X")), but as expected, it's taking forever to go through lol.
Best Answer
I typically find purrr::quietly() helpful when I get warnings (for errors, possibly() is better). It returns a list with the following elements per iteration:
- result
- output
- warnings
- messages
Here is a reprex showing how to identify the data frames that give you problems:
library(purrr)
# Replace "log" with your function, and the vector with your list of dataframes
res <- c(10, 20, -1) %>% map(quietly(log)) # note the quietly()
# This gives you the first index where you got a warning
res %>% detect_index(~length(.x$warnings) > 0)
#> [1] 3
# With this map you can find the warning of all dataframes, also those who don't
# have any. The index will tell you where all problems are
res %>% map(~.x$warnings)
#> [[1]]
#> character(0)
#>
#> [[2]]
#> character(0)
#>
#> [[3]]
#> [1] "NaNs produced"
# With keep you can see all results from iterations with warnings
res %>% keep(~length(.x$warnings) > 0)
#> [[1]]
#> [[1]]$result
#> [1] NaN
#>
#> [[1]]$output
#> [1] ""
#>
#> [[1]]$warnings
#> [1] "NaNs produced"
#>
#> [[1]]$messages
#> character(0)
Created on 2022-04-05 by the reprex package (v2.0.1)
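Since detect_index() only returns the first match, here is a rough sketch of how you might list every offending element at once. The function my_function and the list dfs below are made-up stand-ins (log() on a column is just an easy way to trigger a warning), so swap in your own function and list:

```r
library(purrr)

# Hypothetical stand-in for your function: takes a data frame and a column name
my_function <- function(df, col) log(df[[col]])

# Hypothetical list of data frames; "b" and "d" contain negative values,
# so log() warns for them
dfs <- list(
  a = data.frame(X = c(1, 2)),
  b = data.frame(X = c(1, -2)),
  c = data.frame(X = c(3, 4)),
  d = data.frame(X = c(-5, 6))
)

# Wrap the function once, then map over the whole list
quiet_fun <- quietly(my_function)
res <- map(dfs, ~ quiet_fun(.x, "X"))

# TRUE for every element where at least one warning was captured
has_warning <- map_lgl(res, ~ length(.x$warnings) > 0)

which(has_warning)       # positions (and names) of the offending elements
names(dfs)[has_warning]  # just the names
```

If your list is unnamed, which(has_warning) still gives the positions, which you can use directly as dfs[[i]].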
You could try it with possibly(). E.g., if some values are not numeric, we cannot divide a number by them (hence an error). For more info on error handling with purrr, see https://aosmith.rbind.io/2020/08/31/handling-errors/
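library(purrr)
library(dplyr)
my_function <- function(x) {
  20/x
}
# possibly() returns NULL instead of stopping when my_function throws an error
find_error <- possibly(.f = my_function, otherwise = NULL)
df <- list(df1 = tibble(values = c(1,2,3)),
           df2 = tibble(values = c("1","2","3")))
# Keep only the elements where the function failed (result is NULL)
df %>% map(find_error) %>% keep(~is.null(.x))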
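If you also want to see why each element failed, purrr::safely() might be worth a look: it keeps the error object instead of discarding it. This is only a sketch along the lines of the example above, using plain data.frame() and a made-up list name dfs:

```r
library(purrr)

my_function <- function(x) {
  20 / x
}

# Same idea as above: df2 holds character values, so the division fails
dfs <- list(
  df1 = data.frame(values = c(1, 2, 3)),
  df2 = data.frame(values = c("1", "2", "3"), stringsAsFactors = FALSE)
)

# safely() returns list(result = ..., error = ...) for each element;
# error is NULL when the call succeeded
res <- map(dfs, safely(my_function))

# Keep only the failed iterations
failed <- keep(res, ~ !is.null(.x$error))

names(failed)                                   # which elements failed
map_chr(failed, ~ conditionMessage(.x$error))   # and the error message for each
```

The trade-off versus possibly() is that you have to pull the results out of the result element afterwards, e.g. with map(res, "result").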