I am trying to reshape my data from long to wide format.
My dataset has 5 million rows, but here is the general format:
ID fact_id fact_value
1 1 a
1 2 b
1 3 a
1 4 a
1 5 a
1 6 a
2 1 b
2 2 a
2 3 b
3 4 c
4 1 a
4 2 b
4 3 c
ID is the participant ID number. fact_id corresponds to a question in a survey. fact_value corresponds to the participant's answer.
I am trying to pivot the data to a wide format, so that each "ID" has its own row and each fact_id number becomes its own column. Example of what I want:
ID 1 2 3 4 5 6
1 a b a a a a
2 b a b NA NA NA
3 NA NA NA c NA NA
4 a b c NA NA NA
I ran this code:
widedata <- longdata %>%
reshape(idvar = "ID", v.names = "fact_value", timevar = "fact_id", direction = "wide")
and my output is weird (in my real data the ID column is actually called md5id). Each ID value does have its own row. However, the fact_id numbers are not creating their own columns. There is an ID column, and a single column named fact_value.c(...), where the parentheses contain the list of fact_id numbers that should be column names. The fact_values are not showing up anywhere; there are just NA values.
ID and fact_value are "characters" and fact_id is an "integer". I ran this as well, and got the same result:
widedata <- longdata %>%
reshape(idvar = "ID", v.names = "fact_value", timevar = as.character("fact_id"), direction = "wide")
I also ran this, and got the same result:
widedata <- longdata%>%
reshape(idvar = "ID", timevar = "fact_id", direction = "wide")
Any idea what could be happening/how to change my code?
We could use pivot_wider:
library(tidyr)
pivot_wider(df, names_from = fact_id, values_from = fact_value)
ID `1` `2` `3` `4` `5` `6`
<int> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 a b a a a a
2 2 b a b NA NA NA
3 3 NA NA NA c NA NA
4 4 a b c NA NA NA
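For a self-contained check, here is a minimal sketch that rebuilds the sample data from the question and reshapes it two ways. The object names wide_tidyr and wide_base are only illustrative. The base R variant rests on an assumption: a single column named fact_value.c(...) is what reshape() tends to produce when it is handed a tibble rather than a plain data frame, so the sketch converts with as.data.frame() first. Whether that is the cause in the original data cannot be confirmed from the post.

```r
library(tidyr)

# Sample data copied from the question (ID kept as integer here for brevity)
longdata <- data.frame(
  ID         = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L, 4L, 4L),
  fact_id    = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 1L, 2L, 3L),
  fact_value = c("a", "b", "a", "a", "a", "a", "b", "a", "b", "c", "a", "b", "c")
)

# tidyr approach: one column per distinct fact_id, missing answers become NA
wide_tidyr <- pivot_wider(longdata, names_from = fact_id, values_from = fact_value)

# Base R approach: coerce to a plain data.frame before calling reshape()
# (assumption: the real longdata may be a tibble, which reshape() handles poorly)
wide_base <- reshape(
  as.data.frame(longdata),
  idvar     = "ID",
  timevar   = "fact_id",
  v.names   = "fact_value",
  direction = "wide"
)
```

With the base R version the new columns are named fact_value.1, fact_value.2, and so on, while pivot_wider uses the bare fact_id values as column names.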