I have a data frame which is structured like this one:

    dd <- data.frame(round = c("round1", "round2", "round1", "round2"),
                     var1 = c(22, 11, 22, 11),
                     var2 = c(33, 44, 33, 44),
                     nam = c("foo", "foo", "bar", "bar"),
                     val = runif(4))

       round var1 var2 nam        val
    1 round1   22   33 foo 0.32995729
    2 round2   11   44 foo 0.89215038
    3 round1   22   33 bar 0.09213526
    4 round2   11   44 bar 0.82644723

From this I would like to obtain a data frame with two rows, one for each value of nam, and the variables var1_round1, var1_round2, var2_round1, var2_round2, val_round1, val_round2. I would really like to find a dplyr solution to this.

      nam var1_round1 var1_round2 var2_round1 var2_round2 val_round1 val_round2
    1 foo          22          11          33          44 0.32995729  0.8921504
    2 bar          22          11          33          44 0.09213526  0.8264472

The closest thing I can think of would be to use spread() in some creative way but I can't seem to figure it out.

Best Answer


We can use tidyr/dplyr to do this. We gather the dataset to 'long' format, unite the 'variable' and 'round' columns into a single 'var' column, and then spread back to 'wide' format.

    library(dplyr)
    library(tidyr)
    gather(dd, variable, value, var1, var2, val) %>%
      unite(var, variable, round) %>%
      spread(var, value)
    #   nam val_round1 val_round2 var1_round1 var1_round2 var2_round1 var2_round2
    # 1 bar  0.7187271  0.6022287          22          11          33          44
    # 2 foo  0.2672339  0.7199101          22          11          33          44

NOTE: The 'val' values differ from those in the question because the OP didn't set a seed for runif().
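For what it's worth, newer tidyr releases (1.0.0 and later) also provide pivot_wider(), which should do the same reshape in one step. The sketch below is an alternative to the gather/unite/spread approach, not a correction of it. The set.seed(1) call and the object name `wide` are my own additions, included only to make the example reproducible and easy to inspect.

```r
library(tidyr)

# Same structure as the question's data. set.seed() is an addition
# (not in the original post) so that runif() is reproducible.
set.seed(1)
dd <- data.frame(
  round = c("round1", "round2", "round1", "round2"),
  var1  = c(22, 11, 22, 11),
  var2  = c(33, 44, 33, 44),
  nam   = c("foo", "foo", "bar", "bar"),
  val   = runif(4)
)

# One row per 'nam'; each value column is spread across the levels of 'round'.
# With several values_from columns, new names take the form <value>_<round>.
wide <- pivot_wider(
  dd,
  id_cols     = nam,
  names_from  = round,
  values_from = c(var1, var2, val)
)

wide
```

This returns a tibble rather than a plain data frame; wrap the result in as.data.frame() if that matters downstream.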