
How to convert specific rows into columns in R?

I have a data frame in R with a single column of food ratings from Amazon.

head(food_ratings)
          product.productId..B001E4KFG0
1         review/userId: A3SGXH7AUHU8GW
2        review/profileName: delmartian
3               review/helpfulness: 1/1
4                     review/score: 5.0
5               review/time: 1303862400
6 review/summary: Good Quality Dog Food

The rows repeat: rows 7 through 12 hold the same six fields for another user (starting with the userId in row 7), and this pattern continues throughout the data.

Therefore, I need each group of 6 rows spread across one row with 6 columns, so that later I can subset, for instance, the review/summary values according to their review/score.

I'm using RStudio 1.0.143.

EDIT: I was asked to show the output of dput(head(food_ratings, 24)), but it was too big regardless of the number used. Thanks a lot.

I took your data and added two more fake users to it. Using tidyr and dplyr, you can create new columns and collapse the data into a tidy data.frame. You can use select from dplyr to drop the id column if you don't need it, or to rearrange the order of the columns.

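library(tidyr)
library(dplyr)

df %>% 
  # split "review/key: value" at the first ": " only, so colons in a summary survive
  separate(product.productId..B001E4KFG0, into = c("details", "data"),
           sep = ": ", extra = "merge") %>% 
  # drop the "review/" prefix so the keys become clean column names
  mutate(details = sub("review/", "", details)) %>% 
  # number each key's occurrences: the n-th userId, n-th score, ... share id n
  group_by(details) %>% 
  mutate(id = row_number()) %>% 
  ungroup() %>% 
  # one column per key, one row per id
  spread(details, data)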


# A tibble: 3 x 7
     id helpfulness profileName score summary                 time       userId        
  <int> <chr>       <chr>       <chr> <chr>                   <chr>      <chr>         
1     1 1/1         delmartian  5.0   Good Quality Dog Food   1303862400 A3SGXH7AUHU8GW
2     2 1/1         martian2    1.0   Good Quality Snake Food 1303862400 123456        
3     3 2/5         martian3    5.0   Good Quality Cat Food   1303862400 123654  
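
If every review really does occupy exactly six lines in the same order, as the question describes, the same reshaping can also be sketched in base R by filling a six-column matrix row by row. This is only an illustration under that assumption; the names lines, keys, values, wide and top_summaries are made up for the example and do not come from the original post.

```r
# Example data: three reviews of six "key: value" lines each.
lines <- c(
  "review/userId: A3SGXH7AUHU8GW", "review/profileName: delmartian",
  "review/helpfulness: 1/1", "review/score: 5.0",
  "review/time: 1303862400", "review/summary: Good Quality Dog Food",
  "review/userId: 123456", "review/profileName: martian2",
  "review/helpfulness: 1/1", "review/score: 1.0",
  "review/time: 1303862400", "review/summary: Good Quality Snake Food",
  "review/userId: 123654", "review/profileName: martian3",
  "review/helpfulness: 2/5", "review/score: 5.0",
  "review/time: 1303862400", "review/summary: Good Quality Cat Food"
)

# Split each line at the first ": " only; drop the "review/" prefix from the key.
keys   <- sub("^review/([^:]+): .*$", "\\1", lines)
values <- sub("^[^:]+: ", "", lines)

# Assumption: every review has exactly 6 lines, always in the same order.
n_fields <- 6
wide <- as.data.frame(matrix(values, ncol = n_fields, byrow = TRUE),
                      stringsAsFactors = FALSE)
names(wide) <- keys[seq_len(n_fields)]

# Subset the summaries by score, as asked in the question.
wide$score <- as.numeric(wide$score)
top_summaries <- wide$summary[wide$score == 5]
```

Here top_summaries holds the summaries of the reviews scored 5. If a review can be missing a line, the matrix fill would silently shift values into the wrong columns, so this shortcut only suits strictly regular input.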

Data:

df <- structure(list(product.productId..B001E4KFG0 = c("review/userId: A3SGXH7AUHU8GW", 
"review/profileName: delmartian", "review/helpfulness: 1/1", 
"review/score: 5.0", "review/time: 1303862400", "review/summary: Good Quality Dog Food", 
"review/userId: 123456", "review/profileName: martian2", "review/helpfulness: 1/1", 
"review/score: 1.0", "review/time: 1303862400", "review/summary: Good Quality Snake Food", 
"review/userId: 123654", "review/profileName: martian3", "review/helpfulness: 2/5", 
"review/score: 5.0", "review/time: 1303862400", "review/summary: Good Quality Cat Food"
)), class = "data.frame", row.names = c(NA, -18L))
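
One caveat about the row_number() step: it lines up the n-th occurrence of each field, which matches reviews correctly only when every review contains all six fields. If some reviews might lack a field, a possible alternative is to derive the id from the userId lines instead. The sketch below uses base R and assumes each review begins with its userId line; the input is hypothetical and shortened, with the helpfulness line deliberately left out of the second review.

```r
# Hypothetical input: the second review has no "review/helpfulness" line.
lines <- c(
  "review/userId: A3SGXH7AUHU8GW", "review/profileName: delmartian",
  "review/helpfulness: 1/1", "review/score: 5.0",
  "review/userId: 123456", "review/profileName: martian2",
  "review/score: 1.0"
)

# Split each line at the first ": " only; drop the "review/" prefix from the key.
keys   <- sub("^review/([^:]+): .*$", "\\1", lines)
values <- sub("^[^:]+: ", "", lines)

# Assumption: each review starts with its userId line, so a running count
# of those lines labels every row with the review it belongs to.
long <- data.frame(id = cumsum(keys == "userId"), key = keys, value = values,
                   stringsAsFactors = FALSE)

# Long to wide: one row per id, one column per key; absent fields become NA.
wide <- reshape(long, idvar = "id", timevar = "key", direction = "wide")
names(wide) <- sub("^value\\.", "", names(wide))
```

In the result, the helpfulness value for the second review is NA rather than a value borrowed from a later review.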
