[英]Combining rows into one row with multiple columns of data R
I have a dataset that measures pre and post weight conditions.我有一个测量前后体重条件的数据集。 I was trying to combine rows with similar ID's to one row and create two additional columns to show the results for pre and post for each ID.
我试图将具有相似 ID 的行合并为一行,并创建两个额外的列来显示每个 ID 的 pre 和 post 的结果。
Original table原表
ID Sex Race TIME SH PRE_WEIGHT POST_WEIGHT 1 FEMALE White POST Poor 168 152 1 FEMALE White PRE Good 168 152 2 MALE Hispanic POST Very Poor 157 143 2 MALE Hispanic PRE Poor 157 143 3 FEMALE Black POST Good 129 118 3 FEMALE Black PRE Satisfactory 129 118
Results should look like this:结果应如下所示:
ID Sex Race TIME SH PRE_WEIGHT POST_WEIGHT TIME_PRE TIME_POST SH_PRE SH_POST 1 FEMALE White POST Poor 168 152 POST PRE Poor Good 2 MALE Hispanic POST Very Poor 157 143 POST PRE Very Poor Poor 3 FEMALE Black POST Good 129 118 POST PRE Good Satisfactor
I tried to do it with reshape which is very slow and didn't work.我试着用 reshape 来做,这很慢而且没有用。 I also wasn't able to reshape the table with tidyr.
我也无法用 tidyr 重塑桌子。
Hope anyone can help me with this.希望任何人都可以帮助我解决这个问题。
An option with data.table
带有
data.table
的选项
library(data.table)
dcast(setDT(df1), ID + Sex + Race + PRE_WEIGHT +
POST_WEIGHT ~ TIME, value.var = 'SH')
-output -输出
# ID Sex Race PRE_WEIGHT POST_WEIGHT POST PRE
#1: 1 FEMALE White 168 152 Poor Good
2#: 2 MALE Hispanic 157 143 Very Poor Poor
3#: 3 FEMALE Black 129 118 Good Satisfactory
A simple base R option might be using reshape
, eg,一个简单的基本 R 选项可能是使用
reshape
,例如,
reshape(
df,
direction = "wide",
idvar = c("ID","Sex","Race"),
timevar = "TIME"
)
which gives这使
ID Sex Race SH.POST PRE_WEIGHT.POST POST_WEIGHT.POST SH.PRE
1 1 FEMALE White Poor 168 152 Good
3 2 MALE Hispanic Very Poor 157 143 Poor
5 3 FEMALE Black Good 129 118 Satisfactory
PRE_WEIGHT.PRE POST_WEIGHT.PRE
1 168 152
3 157 143
5 129 118
Data数据
> dput(df)
structure(list(ID = c(1L, 1L, 2L, 2L, 3L, 3L), Sex = c("FEMALE",
"FEMALE", "MALE", "MALE", "FEMALE", "FEMALE"), Race = c("White",
"White", "Hispanic", "Hispanic", "Black", "Black"), TIME = c("POST",
"PRE", "POST", "PRE", "POST", "PRE"), SH = c("Poor", "Good",
"Very Poor", "Poor", "Good", "Satisfactory"), PRE_WEIGHT = c(168L,
168L, 157L, 157L, 129L, 129L), POST_WEIGHT = c(152L, 152L, 143L,
143L, 118L, 118L)), class = "data.frame", row.names = c(NA, -6L
))
As @Duck mentioned you can do this正如@Duck 提到的,你可以这样做
library(dplyr)
library(tidyr)
df3 %>%
group_by(ID) %>%
pivot_wider(names_from = TIME,
values_from = SH,
names_prefix = "SH_")
#> # A tibble: 3 x 7
#> # Groups: ID [3]
#> ID Sex Race PRE_WEIGHT POST_WEIGHT SH_POST SH_PRE
#> <dbl> <chr> <chr> <dbl> <dbl> <chr> <chr>
#> 1 1 FEMALE White 168 152 Poor Good
#> 2 2 MALE Hispanic 157 143 Very Poor Poor
#> 3 3 FEMALE Black 129 118 Good Satisfactory
Your data您的数据
df3 <- readr::read_table("
ID Sex Race TIME SH PRE_WEIGHT POST_WEIGHT
1 FEMALE White POST Poor 168 152
1 FEMALE White PRE Good 168 152
2 MALE Hispanic POST Very Poor 157 143
2 MALE Hispanic PRE Poor 157 143
3 FEMALE Black POST Good 129 118
3 FEMALE Black PRE Satisfactory 129 118")
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.