How can I reshape multiple columns of long data to wide?
structure(tibble(c("top", "jng", "mid", "bot", "sup"), c("369", "Karsa", "knight", "JackeyLove", "yuyanjia"),
c("Malphite", "Rek'Sai", "Zoe", "Aphelios", "Braum"), c("1", "1", "1", "1", "1"), c("7", "5", "7", "5", "0"),
c("6079-7578", "6079-7578", "6079-7578", "6079-7578", "6079-7578")), .Names = c("position", "player", "champion", "result", "kills", "gameid"))
Output:
# A tibble: 5 x 6
position player champion result kills gameid
* <chr> <chr> <chr> <chr> <chr> <chr>
1 top 369 Malphite 1 7 6079-7578
2 jng Karsa Rek'Sai 1 5 6079-7578
3 mid knight Zoe 1 7 6079-7578
4 bot JackeyLove Aphelios 1 5 6079-7578
5 sup yuyanjia Braum 1 0 6079-7578
My desired output is:
structure(list(gameid = "6079-7578", result = "1", player_top = "369",
player_jng = "Karsa", player_mid = "knight", player_bot = "JackeyLove",
player_sup = "yuyanjia", champion_top = "Malphite", champion_jng = "Rek'Sai",
champion_mid = "Zoe", champion_bot = "Aphelios", champion_sup = "Braum",
kills_top = "7", kills_jng = "5", kills_mid = "7", kills_bot = "5",
kills_sup = "0"), row.names = c(NA, -1L), class = c("tbl_df",
"tbl", "data.frame"))
which looks like this:
gameid result player_top player_jng player_mid player_bot player_sup champion_top champion_jng champion_mid champion_bot champion_sup
1 6079-7578 1 369 Karsa knight JackeyLove yuyanjia Malphite Rek'Sai Zoe Aphelios Braum
kills_top kills_jng kills_mid kills_bot kills_sup
1 7 5 7 5 0
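I know I should use something like pivot_wider() and drop_na(), but I can't figure out how to run pivot_wider() on multiple columns and collapse the rows at the same time. Any help would be appreciated.

You can do this with pivot_wider(): pass the "position" variable to names_from, so that the new column names come from it, and pass the three variables that hold the values for those columns to values_from.
By default, the names of multiple values_from variables are pasted onto the front of the new column names. This can be changed, but in this case the default matches the naming structure you want.
All other variables in the original dataset are used as id_cols, in the order they appear.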
library(tidyr)
# dat is the tibble from the question, assigned to a variable
pivot_wider(dat,
            names_from = "position",
            values_from = c("player", "champion", "kills"))
#> result gameid player_top player_jng player_mid player_bot player_sup
#> 1 1 6079-7578 369 Karsa knight JackeyLove yuyanjia
#> champion_top champion_jng champion_mid champion_bot champion_sup kills_top
#> 1 Malphite Rek'Sai Zoe Aphelios Braum 7
#> kills_jng kills_mid kills_bot kills_sup
#> 1 5 7 5 0
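If you do want a different naming scheme, one option is the names_glue argument of pivot_wider(). The following is only a sketch: it rebuilds the question's data under the name dat, and the "{position}_{.value}" pattern is just an illustrative choice that puts the position first.

```r
library(tibble)
library(tidyr)

# Same data as in the question, built with named columns.
dat <- tibble(
  position = c("top", "jng", "mid", "bot", "sup"),
  player   = c("369", "Karsa", "knight", "JackeyLove", "yuyanjia"),
  champion = c("Malphite", "Rek'Sai", "Zoe", "Aphelios", "Braum"),
  result   = rep("1", 5),
  kills    = c("7", "5", "7", "5", "0"),
  gameid   = rep("6079-7578", 5)
)

# {.value} stands for each values_from column name,
# {position} for each value of the names_from column.
wide <- pivot_wider(dat,
                    names_from = "position",
                    values_from = c("player", "champion", "kills"),
                    names_glue = "{position}_{.value}")

names(wide)
```

With this pattern the new columns are named top_player, jng_player and so on, instead of player_top, player_jng.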
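You can control the order of the id columns in the output by writing them out explicitly in id_cols. Here is an example that matches your desired output.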
pivot_wider(dat, id_cols = c("gameid", "result"),
names_from = "position",
values_from = c("player", "champion", "kills"))
#> gameid result player_top player_jng player_mid player_bot player_sup
#> 1 6079-7578 1 369 Karsa knight JackeyLove yuyanjia
#> champion_top champion_jng champion_mid champion_bot champion_sup kills_top
#> 1 Malphite Rek'Sai Zoe Aphelios Braum 7
#> kills_jng kills_mid kills_bot kills_sup
#> 1 5 7 5 0
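Using data.table may help here. In dcast(), each row is identified by a unique combination of gameid and result, the columns are spread by position, and they are filled with the values of the variables listed in value.var.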
library(data.table)
library(dplyr)
df <- structure(tibble(c("top", "jng", "mid", "bot", "sup"), c("369", "Karsa", "knight", "JackeyLove", "yuyanjia"),
c("Malphite", "Rek'Sai", "Zoe", "Aphelios", "Braum"), c("1", "1", "1", "1", "1"), c("7", "5", "7", "5", "0"),
c("6079-7578", "6079-7578", "6079-7578", "6079-7578", "6079-7578")), .Names = c("position", "player", "champion", "result", "kills", "gameid"))
# value.var takes the names of the columns to spread as a character vector
df2 <- dcast(setDT(df), gameid + result ~ position, value.var = c("player", "champion", "kills"))
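To see how the left-hand side of the formula determines the rows, here is a small sketch with two games. The second game (gameid "0000-0000", playerA, playerB and their values) is made up purely for illustration, and only two positions are kept to keep it short.

```r
library(data.table)

# Hypothetical data: the first game is taken from the question,
# the second game is invented for this example.
games <- data.table(
  gameid   = rep(c("6079-7578", "0000-0000"), each = 2),
  result   = rep(c("1", "0"), each = 2),
  position = rep(c("top", "jng"), times = 2),
  player   = c("369", "Karsa", "playerA", "playerB"),
  kills    = c("7", "5", "2", "3")
)

# One output row per gameid + result combination;
# one output column per value.var and position pair.
wide <- dcast(games, gameid + result ~ position,
              value.var = c("player", "kills"))

print(wide)
```

Each game ends up on its own row, with columns such as player_top and kills_jng.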