Select value based on other value in the same row
I have a data frame (df) with the columns Date, Winner, Loser, WRank and LRank. WRank is the rank of the player in the Winner column, and LRank is the rank of the player in the Loser column. I want a new data frame with the date, name and rank. The problem is that the name I want can appear in either column, Winner or Loser. If the name is in the Winner column I want WRank, but if it is in the Loser column I want LRank. How do I do this?
The df looks like this:
Date Winner Loser WRank LRank
1 2000-01-03 Federer R. Knippschild J. 65 87
2 2000-01-03 Enqvist T. Federer R. 5 65
3 2000-01-10 Ferrero J.C. Federer R. 45 61
4 2000-01-17 Federer R. Chang M. 62 38
5 2000-01-17 Federer R. Kroslak J. 62 104
6 2000-01-17 Clement A. Federer R. 54 62
And the format I want looks like this:
Date Name Rank
1 2000-01-03 Federer R. 65
2 2000-01-03 Federer R. 65
3 2000-01-10 Federer R. 61
4 2000-01-17 Federer R. 62
5 2000-01-17 Federer R. 62
6 2000-01-17 Federer R. 62
We can use the functions found in the tidyverse package:
library(tidyverse)
dat %>%
  # create single winner and loser columns,
  # concatenating name and rank together
  unite(Winner, Winner, WRank, sep = "-") %>%
  unite(Loser, Loser, LRank, sep = "-") %>%
  # pivot to be "tall"
  pivot_longer(cols = c("Winner", "Loser")) %>%
  select(-name) %>%
  # reverse the concatenation
  separate(value, into = c("Name", "Rank"), sep = "-")
#    Date       Name           Rank
#  1 2000-01-03 Federer R.     65
#  2 2000-01-03 Knippschild J. 87
#  3 2000-01-03 Enqvist T.     5
#  4 2000-01-03 Federer R.     65
#  5 2000-01-10 Ferrero J.C.   45
#  6 2000-01-10 Federer R.     61
#  7 2000-01-17 Federer R.     62
#  8 2000-01-17 Chang M.       38
#  9 2000-01-17 Federer R.     62
# 10 2000-01-17 Kroslak J.     104
# 11 2000-01-17 Clement A.     54
# 12 2000-01-17 Federer R.     62
One thing to note is that this will convert your Rank to a character value. You can reverse that using the as.numeric function.
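As a rough sketch of that last step, the pipeline can be extended with a conversion back to numeric and a filter for one player. The three-row `dat` below is made-up example data in the same shape as the question's, and `res` is just an illustrative name:

```r
library(dplyr)
library(tidyr)

# Example data in the same shape as the question (first three matches only)
dat <- data.frame(
  Date   = c("2000-01-03", "2000-01-03", "2000-01-10"),
  Winner = c("Federer R.", "Enqvist T.", "Ferrero J.C."),
  Loser  = c("Knippschild J.", "Federer R.", "Federer R."),
  WRank  = c(65L, 5L, 45L),
  LRank  = c(87L, 65L, 61L)
)

res <- dat %>%
  unite(Winner, Winner, WRank, sep = "-") %>%
  unite(Loser, Loser, LRank, sep = "-") %>%
  pivot_longer(cols = c("Winner", "Loser")) %>%
  select(-name) %>%
  separate(value, into = c("Name", "Rank"), sep = "-") %>%
  # Rank is character after separate(); turn it back into a number
  mutate(Rank = as.numeric(Rank)) %>%
  # keep only the player of interest
  filter(Name == "Federer R.")

res
```

Note that splitting on "-" assumes no player name contains a hyphen; a name that does would be split in the wrong place, so a separator that cannot occur in the names may be a safer choice.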
A function may help to extract those rows and values based on the player name. We filter the rows where the player name is in either the 'Winner' or (|) the 'Loser' column, then use transmute to create the three-column output: 'Date', 'Name' (the input player name), and 'Rank'. To get 'Rank', we build a logical matrix by comparing the 'Winner' and 'Loser' columns with the player name, and feed it into max.col to get the column index of the maximum value in each row (TRUE => 1, FALSE => 0). We then cbind that with the row index (row_number) and use the resulting matrix to extract the corresponding elements from the 'WRank' and 'LRank' columns.
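library(dplyr)
f1 <- function(dat, nm) {
  dat %>%
    # keep only the matches the player took part in
    filter(Winner == nm | Loser == nm) %>%
    transmute(Date, Name = nm,
              # pick WRank or LRank per row via (row, column) matrix indexing
              Rank = .[c('WRank', 'LRank')][cbind(row_number(),
                       max.col(.[c('Winner', 'Loser')] == nm))])
}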
Testing:
f1(df1, 'Federer R.')
# Date Name Rank
#1 2000-01-03 Federer R. 65
#2 2000-01-03 Federer R. 65
#3 2000-01-10 Federer R. 61
#4 2000-01-17 Federer R. 62
#5 2000-01-17 Federer R. 62
#6 2000-01-17 Federer R. 62
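To see what the indexing inside f1 is doing, the same steps can be run one at a time in base R. This is only an illustrative sketch: the small data frame `d` and the names `hit`, `col_idx`, `idx` and `rank` are made up for the example.

```r
# Small example data in the same shape as df1 (assumed for illustration)
d <- data.frame(
  Winner = c("Federer R.", "Enqvist T.", "Clement A."),
  Loser  = c("Knippschild J.", "Federer R.", "Federer R."),
  WRank  = c(65L, 5L, 54L),
  LRank  = c(87L, 65L, 62L)
)
nm <- "Federer R."

# Logical matrix: TRUE where the player's name appears
hit <- d[c("Winner", "Loser")] == nm

# Column index of the TRUE in each row: 1 = Winner, 2 = Loser
col_idx <- max.col(hit)

# Two-column (row, column) matrix used to pick one cell per row
idx <- cbind(seq_len(nrow(d)), col_idx)

# Extract the matching rank from the WRank/LRank columns
rank <- as.matrix(d[c("WRank", "LRank")])[idx]
rank
```

This relies on every row containing the player in exactly one of the two columns, which is why f1 filters first. For a row where neither column matches, both entries are FALSE and max.col, with its default ties.method of "random", would pick a column at random.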
Data:
df1 <- structure(list(Date = c("2000-01-03", "2000-01-03", "2000-01-10",
  "2000-01-17", "2000-01-17", "2000-01-17"), Winner = c("Federer R.",
  "Enqvist T.", "Ferrero J.C.", "Federer R.", "Federer R.", "Clement A."
  ), Loser = c("Knippschild J.", "Federer R.", "Federer R.", "Chang M.",
  "Kroslak J.", "Federer R."), WRank = c(65L, 5L, 45L, 62L, 62L,
  54L), LRank = c(87L, 65L, 61L, 38L, 104L, 62L)),
  class = "data.frame", row.names = c("1",
  "2", "3", "4", "5", "6"))
You can combine the winner and loser names into one column and the two ranks into another column, and then select the player name.
library(dplyr)
player_name <- 'Federer R.'
df %>%
  rename_with(~paste0(., '_Name'), c(Winner, Loser)) %>%
  rename_with(~paste0(., '_Rank'), ends_with('Rank')) %>%
  tidyr::pivot_longer(cols = -Date,
                      names_pattern = '.*_(\\w+)',
                      names_to = '.value') %>%
  filter(Name == player_name)
# Date Name Rank
# <chr> <chr> <int>
#1 2000-01-03 Federer R. 65
#2 2000-01-03 Federer R. 65
#3 2000-01-10 Federer R. 61
#4 2000-01-17 Federer R. 62
#5 2000-01-17 Federer R. 62
#6 2000-01-17 Federer R. 62
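It may help to look at the intermediate result of the two rename_with steps, since the pivot depends on the suffixes they add. The sketch below uses a made-up two-row `df`; `renamed` and `long` are illustrative names only.

```r
library(dplyr)
library(tidyr)

# Two example matches in the same shape as the question's data
df <- data.frame(
  Date   = c("2000-01-03", "2000-01-03"),
  Winner = c("Federer R.", "Enqvist T."),
  Loser  = c("Knippschild J.", "Federer R."),
  WRank  = c(65L, 5L),
  LRank  = c(87L, 65L)
)

# Step 1: tag the name columns with "_Name" and the rank columns with "_Rank"
renamed <- df %>%
  rename_with(~paste0(., "_Name"), c(Winner, Loser)) %>%
  rename_with(~paste0(., "_Rank"), ends_with("Rank"))
names(renamed)

# Step 2: the part after the underscore becomes the output column name (.value),
# so both *_Name columns stack into Name and both *_Rank columns into Rank
long <- renamed %>%
  pivot_longer(cols = -Date,
               names_pattern = ".*_(\\w+)",
               names_to = ".value")
long
```

After step 1 the columns are Date, Winner_Name, Loser_Name, WRank_Rank and LRank_Rank. Step 2 then yields one row per player per match, which is what makes the final filter on Name possible.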