How to convert specific rows into columns in R?

I have a data frame in R with a single column of food ratings from Amazon.

1         review/userId: A3SGXH7AUHU8GW
2        review/profileName: delmartian
3               review/helpfulness: 1/1
4                     review/score: 5.0
5               review/time: 1303862400
6 review/summary: Good Quality Dog Food

The rows repeat: rows 7 through 12 hold the same six fields for another user (starting at row 7), and this pattern continues many times.

I therefore need each group of 6 rows spread into a single row with 6 columns, so that later I can subset, for instance, the review/summary values according to their review/score.

I'm using RStudio 1.0.143.

EDIT: I was asked to show the output of dput(head(food_ratings, 24)), but it was too big regardless of the number of rows I used. Thanks a lot.

I took your data and added two more fake users to it. With tidyr and dplyr you can create the new columns and collapse the data into a tidy data.frame. If you don't need the id column, you can use select from dplyr to drop it, or to rearrange the order of the columns.


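library(dplyr)
library(tidyr)
df %>% 
  # split "review/userId: A3S..." into a field name and its value; keep any later ": " in the value
  separate(product.productId..B001E4KFG0, into = c("details", "data"), sep = ": ", extra = "merge") %>% 
  # strip the "review/" prefix (no trailing space, otherwise nothing matches)
  mutate(details = sub("review/", "", details)) %>% 
  # the n-th occurrence of each field belongs to review n
  group_by(details) %>% 
  mutate(id = row_number()) %>% 
  # one column per field, one row per review
  spread(details, data)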

# A tibble: 3 x 7
     id helpfulness profileName score summary                 time       userId        
  <int> <chr>       <chr>       <chr> <chr>                   <chr>      <chr>         
1     1 1/1         delmartian  5.0   Good Quality Dog Food   1303862400 A3SGXH7AUHU8GW
2     2 1/1         martian2    1.0   Good Quality Snake Food 1303862400 123456        
3     3 2/5         martian3    5.0   Good Quality Cat Food   1303862400 123654  
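
Once the data is wide, the subsetting mentioned in the question can be sketched roughly as follows. This is only an illustration: the `reviews` data frame re-enters the printed result above by hand, and the names `reviews` and `five_star` are made up for the example. Note that every column comes out as character, so `score` has to be converted before a numeric comparison.

```r
library(dplyr)

# Wide result from above, typed in by hand so the sketch runs on its own
# (hypothetical object name; values copied from the printed tibble).
reviews <- data.frame(
  id          = 1:3,
  helpfulness = c("1/1", "1/1", "2/5"),
  profileName = c("delmartian", "martian2", "martian3"),
  score       = c("5.0", "1.0", "5.0"),
  summary     = c("Good Quality Dog Food", "Good Quality Snake Food",
                  "Good Quality Cat Food"),
  time        = c("1303862400", "1303862400", "1303862400"),
  userId      = c("A3SGXH7AUHU8GW", "123456", "123654"),
  stringsAsFactors = FALSE
)

five_star <- reviews %>%
  select(-id) %>%                      # drop the helper column
  mutate(score = as.numeric(score)) %>% # "5.0" (text) -> 5 (number)
  filter(score == 5) %>%               # keep only five-star reviews
  select(summary, score)

print(five_star)
```

The same pattern works with any other condition on `score`, for example `filter(score <= 2)` for the low ratings.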

Data:

df <- structure(list(product.productId..B001E4KFG0 = c("review/userId: A3SGXH7AUHU8GW", 
"review/profileName: delmartian", "review/helpfulness: 1/1", 
"review/score: 5.0", "review/time: 1303862400", "review/summary: Good Quality Dog Food", 
"review/userId: 123456", "review/profileName: martian2", "review/helpfulness: 1/1", 
"review/score: 1.0", "review/time: 1303862400", "review/summary: Good Quality Snake Food", 
"review/userId: 123654", "review/profileName: martian3", "review/helpfulness: 2/5", 
"review/score: 5.0", "review/time: 1303862400", "review/summary: Good Quality Cat Food"
)), class = "data.frame", row.names = c(NA, -18L))
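
If installing tidyr is not an option, a base R alternative (a different technique from the pipeline above, not part of the original answer) is to fill a 6-column matrix row by row. This sketch assumes every review has exactly six lines and that they always appear in the same order; if that does not hold for the real file, the columns will silently misalign. The vector `review_lines` is a made-up two-review sample in the question's layout.

```r
# Hypothetical sample: two reviews, six lines each, same order in both.
review_lines <- c(
  "review/userId: A3SGXH7AUHU8GW", "review/profileName: delmartian",
  "review/helpfulness: 1/1", "review/score: 5.0",
  "review/time: 1303862400", "review/summary: Good Quality Dog Food",
  "review/userId: 123456", "review/profileName: martian2",
  "review/helpfulness: 1/1", "review/score: 1.0",
  "review/time: 1303862400", "review/summary: Good Quality Snake Food"
)

# Field name: the text between "review/" and the first ": ".
keys <- sub("^review/([^:]+): .*$", "\\1", review_lines)
# Value: everything after the first ": " (later colons are kept).
values <- sub("^[^:]+: ", "", review_lines)

# Fill a 6-column matrix row by row, so each row is one review.
wide <- as.data.frame(matrix(values, ncol = 6, byrow = TRUE),
                      stringsAsFactors = FALSE)
names(wide) <- keys[1:6]

# Scores arrive as text; convert before comparing.
wide$score <- as.numeric(wide$score)
top_summaries <- wide$summary[wide$score >= 5]
print(top_summaries)
```

The tidyr version is more forgiving here, because it matches values by field name rather than by position.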
