
Select value based on other value in the same row

I have a dataframe (df) with the columns Date, Winner, Loser, WRank and LRank. WRank is the rank of the player in the Winner column, and LRank is the rank of the player in the Loser column. I want a new dataframe with the date, name and rank. The problem is that the name I want can appear in either column, Winner or Loser. If the name is in the Winner column I want WRank, but if it is in the Loser column I want LRank. How do I do this?

The df looks like this:

        Date       Winner          Loser WRank LRank
1 2000-01-03   Federer R. Knippschild J.    65    87
2 2000-01-03   Enqvist T.     Federer R.     5    65
3 2000-01-10 Ferrero J.C.     Federer R.    45    61
4 2000-01-17   Federer R.       Chang M.    62    38
5 2000-01-17   Federer R.     Kroslak J.    62   104
6 2000-01-17   Clement A.     Federer R.    54    62

And the format I want looks like this:

        Date       Name Rank
1 2000-01-03 Federer R.   65
2 2000-01-03 Federer R.   65
3 2000-01-10 Federer R.   61
4 2000-01-17 Federer R.   62
5 2000-01-17 Federer R.   62
6 2000-01-17 Federer R.   62

We can use functions from the tidyverse package:

library(tidyverse)
dat %>%
  # create single winner and loser columns,
  # concatenating name and rank together
  unite(Winner, Winner, WRank, sep = "-") %>%
  unite(Loser, Loser, LRank, sep = "-") %>%
  # pivot to be "tall"
  pivot_longer(cols = c("Winner", "Loser")) %>%
  select(-name) %>%
  # reverse the concatenation
  separate(value, into = c("Name", "Rank"), sep = "-")

 #   Date       Name           Rank 
 # 1 2000-01-03 Federer R.     65   
 # 2 2000-01-03 Knippschild J. 87   
 # 3 2000-01-03 Enqvist T.     5    
 # 4 2000-01-03 Federer R.     65   
 # 5 2000-01-10 Ferrero J.C.   45   
 # 6 2000-01-10 Federer R.     61   
 # 7 2000-01-17 Federer R.     62   
 # 8 2000-01-17 Chang M.       38   
 # 9 2000-01-17 Federer R.     62   
 #10 2000-01-17 Kroslak J.     104  
 #11 2000-01-17 Clement A.     54   
 #12 2000-01-17 Federer R.     62   

One thing to note is that this converts Rank to a character value. You can reverse that with the as.numeric function.
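As a minimal sketch of that conversion, the pipeline can end with a mutate() call that applies as.numeric, followed by a filter() on the player of interest. The two-row `dat` and the name `long` are assumptions made for illustration; they are not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Two-row illustrative sample (a subset of the question's data)
dat <- data.frame(
  Date   = c("2000-01-03", "2000-01-03"),
  Winner = c("Federer R.", "Enqvist T."),
  Loser  = c("Knippschild J.", "Federer R."),
  WRank  = c(65L, 5L),
  LRank  = c(87L, 65L)
)

long <- dat %>%
  unite(Winner, Winner, WRank, sep = "-") %>%
  unite(Loser, Loser, LRank, sep = "-") %>%
  pivot_longer(cols = c("Winner", "Loser")) %>%
  select(-name) %>%
  separate(value, into = c("Name", "Rank"), sep = "-") %>%
  # Rank is character after separate(); turn it back into a number
  mutate(Rank = as.numeric(Rank)) %>%
  # keep only the rows for the player of interest
  filter(Name == "Federer R.")
```

Note that this approach relies on "-" not occurring in any player name; a name containing a hyphen would need a different separator.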

Maybe a function helps to extract those rows and values based on the player name. We filter the rows where the player name is in either the 'Winner' or (|) the 'Loser' column, then use transmute to create the three-column output: 'Date', 'Name' set to the input player name, and 'Rank'. To get 'Rank', compare the 'Winner' and 'Loser' columns with the player name to create a logical matrix, feed that into max.col to get the column index of the maximum value in each row (TRUE => 1, FALSE => 0), cbind it with the row index (row_number), and use the result to extract the corresponding elements from the 'WRank' and 'LRank' columns.

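library(dplyr)

f1 <- function(dat, nm) {
  dat %>%
    # keep rows where the player appears as winner or loser
    filter(Winner == nm | Loser == nm) %>%
    # pick WRank or LRank per row via (row, column) matrix indexing
    transmute(Date, Name = nm,
              Rank = .[c('WRank', 'LRank')][cbind(row_number(),
                max.col(.[c('Winner', 'Loser')] == nm))])
}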

-testing

f1(df1, 'Federer R.')
#        Date       Name Rank
#1 2000-01-03 Federer R.   65
#2 2000-01-03 Federer R.   65
#3 2000-01-10 Federer R.   61
#4 2000-01-17 Federer R.   62
#5 2000-01-17 Federer R.   62
#6 2000-01-17 Federer R.   62

data

df1 <- structure(list(Date = c("2000-01-03", "2000-01-03", "2000-01-10", 
"2000-01-17", "2000-01-17", "2000-01-17"), Winner = c("Federer R.", 
"Enqvist T.", "Ferrero J.C.", "Federer R.", "Federer R.", "Clement A."
), Loser = c("Knippschild J.", "Federer R.", "Federer R.", "Chang M.", 
"Kroslak J.", "Federer R."), WRank = c(65L, 5L, 45L, 62L, 62L, 
54L), LRank = c(87L, 65L, 61L, 38L, 104L, 62L)),
class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))
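
The indexing step inside f1 may be easier to follow in isolation. The sketch below uses only base R and a two-row sample; the object names (`winners_losers`, `ranks`, `is_player`, `col_idx`, `picked`) are made up for illustration.

```r
# Illustrative two-row sample; names and ranks are taken from the question's data
winners_losers <- data.frame(
  Winner = c("Federer R.", "Enqvist T."),
  Loser  = c("Knippschild J.", "Federer R.")
)
ranks <- data.frame(WRank = c(65L, 5L), LRank = c(87L, 65L))
nm <- "Federer R."

# Logical matrix: TRUE where the cell equals the player name
is_player <- winners_losers == nm

# Column position of the TRUE in each row (1 = Winner, 2 = Loser)
col_idx <- max.col(is_player)

# A two-column (row, column) matrix picks exactly one cell per row
picked <- ranks[cbind(seq_len(nrow(ranks)), col_idx)]
```

Here `picked` holds the player's own rank for each match, regardless of which column the name was in. This assumes every row contains the name in exactly one of the two columns, which the filter step in f1 ensures.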

You can combine the winner and loser names into one column and the ranks into another, and then select the player name.

library(dplyr)

player_name <- 'Federer R.'

df %>%
  rename_with(~paste0(., '_Name'), c(Winner, Loser)) %>%
  rename_with(~paste0(., '_Rank'), ends_with('Rank')) %>%
  tidyr::pivot_longer(cols = -Date, 
               names_pattern = '.*_(\\w+)', 
               names_to = '.value') %>%
  filter(Name == player_name)

#   Date       Name        Rank
#  <chr>      <chr>      <int>
#1 2000-01-03 Federer R.    65
#2 2000-01-03 Federer R.    65
#3 2000-01-10 Federer R.    61
#4 2000-01-17 Federer R.    62
#5 2000-01-17 Federer R.    62
#6 2000-01-17 Federer R.    62

