How to convert specific rows into columns in R?
I have a df in R with only a single column of food ratings from Amazon.
head(food_ratings)
product.productId..B001E4KFG0
1 review/userId: A3SGXH7AUHU8GW
2 review/profileName: delmartian
3 review/helpfulness: 1/1
4 review/score: 5.0
5 review/time: 1303862400
6 review/summary: Good Quality Dog Food
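These rows repeat, so rows 7 to 12 hold the same fields for another user (starting at row 7). This pattern repeats many times.
So I need to turn each group of 6 rows into a single row with 6 columns, so that I can later subset the data by review/score.
I am using RStudio 1.0.143.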
Edit: I was asked to show the output of dput(head(food_ratings, 24)), but it is too large no matter how many rows I use. Thanks a lot.
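I have taken your data and added 2 other fake users. Using tidyr and dplyr you can create the new columns and collapse the data into a tidy data.frame. If you don't need the id column, you can use dplyr's select to drop it or to rearrange the column order.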
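library(tidyr)
library(dplyr)

df %>%
  # Split each "key: value" string into two columns; extra = "merge" keeps
  # any later ": " inside the value instead of dropping it.
  separate(product.productId..B001E4KFG0, into = c("details", "data"),
           sep = ": ", extra = "merge") %>%
  # Strip the "review/" prefix so the keys become clean column names.
  mutate(details = sub("review/", "", details)) %>%
  # Number the occurrences of each key: the n-th userId, n-th score, ... share id n.
  group_by(details) %>%
  mutate(id = row_number()) %>%
  # Turn the keys into columns, one row per id.
  spread(details, data)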
# A tibble: 3 x 7
     id helpfulness profileName score summary                 time       userId
  <int> <chr>       <chr>       <chr> <chr>                   <chr>      <chr>
1     1 1/1         delmartian  5.0   Good Quality Dog Food   1303862400 A3SGXH7AUHU8GW
2     2 1/1         martian2    1.0   Good Quality Snake Food 1303862400 123456
3     3 2/5         martian3    5.0   Good Quality Cat Food   1303862400 123654
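If every review really does have exactly six lines in the same order, a base R sketch without extra packages might look like the following. The names lines, keys, values and wide are illustrative and not from the original post, and the approach assumes no review is missing a field. It also converts score to numeric, since the goal in the question was to subset by review/score.

```r
# Sample input: "key: value" strings, six per review
# (two records copied from the example data).
lines <- c(
  "review/userId: A3SGXH7AUHU8GW", "review/profileName: delmartian",
  "review/helpfulness: 1/1", "review/score: 5.0",
  "review/time: 1303862400", "review/summary: Good Quality Dog Food",
  "review/userId: 123456", "review/profileName: martian2",
  "review/helpfulness: 1/1", "review/score: 1.0",
  "review/time: 1303862400", "review/summary: Good Quality Snake Food"
)

# Split each line at the first ": " only, so values containing ": " stay intact.
keys   <- sub("^review/([^:]*): .*$", "\\1", lines)
values <- sub("^[^:]*: ", "", lines)

# Fill a 6-column matrix row by row: each row becomes one review.
wide <- as.data.frame(matrix(values, ncol = 6, byrow = TRUE),
                      stringsAsFactors = FALSE)
names(wide) <- keys[1:6]

# Convert the score to numeric so it can be used for subsetting.
wide$score <- as.numeric(wide$score)
wide[wide$score >= 4, ]
```

The matrix approach is fast, but it silently misaligns every following row if a single line is missing, so it is only appropriate when the six-line pattern is guaranteed.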
Data:
df <- structure(list(product.productId..B001E4KFG0 = c("review/userId: A3SGXH7AUHU8GW",
"review/profileName: delmartian", "review/helpfulness: 1/1",
"review/score: 5.0", "review/time: 1303862400", "review/summary: Good Quality Dog Food",
"review/userId: 123456", "review/profileName: martian2", "review/helpfulness: 1/1",
"review/score: 1.0", "review/time: 1303862400", "review/summary: Good Quality Snake Food",
"review/userId: 123654", "review/profileName: martian3", "review/helpfulness: 2/5",
"review/score: 5.0", "review/time: 1303862400", "review/summary: Good Quality Cat Food"
)), class = "data.frame", row.names = c(NA, -18L))
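Numbering the rows within each key, as above, relies on every review containing every field. As a hedged alternative, assuming only that each review starts with a review/userId line, the record id can be derived with cumsum() and the data widened with base R's reshape(). The input below is hypothetical (the second review has no review/time line), and the names lines, long and wide are illustrative.

```r
# Hypothetical input: the second review has no "review/time" line.
lines <- c(
  "review/userId: A3SGXH7AUHU8GW", "review/profileName: delmartian",
  "review/helpfulness: 1/1", "review/score: 5.0",
  "review/time: 1303862400", "review/summary: Good Quality Dog Food",
  "review/userId: 123456", "review/profileName: martian2",
  "review/helpfulness: 1/1", "review/score: 1.0",
  "review/summary: Good Quality Snake Food"
)

# Long format: one row per "key: value" line.
long <- data.frame(
  key   = sub("^review/([^:]*): .*$", "\\1", lines),
  value = sub("^[^:]*: ", "", lines),
  stringsAsFactors = FALSE
)

# A new review starts at every userId line; cumsum() numbers the reviews.
long$id <- cumsum(long$key == "userId")

# Widen: one row per id, one column per key; absent fields become NA.
wide <- reshape(long, idvar = "id", timevar = "key", direction = "wide")
names(wide) <- sub("^value\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

With this input the second row should carry NA in the time column rather than shifting the remaining values.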