Assigning values based on groups using data.table in R

Question

I have the following data set:

Name         Make_Miss       Half        
Player A         1             1                
Player B         1             1                
Player A         0             2                
Player A         0             1                
Player A         1             1                
Player B         0             2

Where Name is the player's name, Make_Miss is whether or not the player made that shot (1 = made, 0 = missed), and Half is the half in which the shot took place. I am currently using the following code to compute the number of shots made in the first half.

Code:

dt[ , Player_First_Made := .N, by = list(dt$Name == "Player A" & dt$Half == 1 & dt$Make_Miss == 1)]

The output:

Name         Make_Miss       Half        Player_First_Made
Player A         1             1                2
Player B         1             1                4
Player A         0             2                4
Player A         0             1                4
Player A         1             1                2
Player B         0             2                4

What is happening here is that wherever Player A has a 0 in the Make_Miss column, the corresponding row in the Player_First_Made column is assigned the count of shots that do not match the criteria in the list (i.e. Name != Player A or Half != 1 or Make_Miss != 1). However, what I want is the following:

Name         Make_Miss       Half        Player_First_Made
Player A         1             1                2
Player B         1             1                4
Player A         0             2                2
Player A         0             1                2
Player A         1             1                2
Player B         0             2                4

I want the rows that match Name = Player A to always have the value of however many shots that player made in the first half. Is there data.table syntax that lets me specify this assignment?
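The 2 and 4 in the output above follow from how by treats the expression: a logical condition defines only two groups, TRUE and FALSE, and .N is the size of each. A minimal sketch that makes this visible, assuming the six rows from the question are rebuilt as dt (the column name matches is made up for illustration):

```r
library(data.table)

# The six rows from the question
dt <- data.table(
  Name      = c("Player A", "Player B", "Player A", "Player A", "Player A", "Player B"),
  Make_Miss = c(1, 1, 0, 0, 1, 0),
  Half      = c(1, 1, 2, 1, 1, 2)
)

# Grouping by a logical expression yields one group per distinct value,
# so there are only two groups here: rows that match and rows that do not.
res <- dt[, .N, by = .(matches = Name == "Player A" & Half == 1 & Make_Miss == 1)]
print(res)
```

The TRUE group has 2 rows and the FALSE group has 4, which is why every non-matching row, including Player A's misses, receives 4.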

Answer 1

As @chinsoon12 points out, the data you have provided don't really make sense. However, here is a method using dplyr which I think will give you what you want:

library(dplyr)

# Make some data
DATA <- data.frame(Name = c("Player A", "Player B", "Player C",
 "Player A", "Player A", "Player B"), Make_Miss = c(1,1,0,0,1,0),
 Half = c(1,1,2,1,2,2))

# Use dplyr to calculate the sums of 'Half' for each player
OUT <- DATA %>% group_by(Name) %>% mutate(Player_First_Made = sum(Half))

# Check the output
> OUT
# A tibble: 6 x 4
# Groups:   Name [3]
  Name     Make_Miss  Half Player_First_Made
  <fct>        <dbl> <dbl>             <dbl>
1 Player A      1     1                 4
2 Player B      1     1                 3 
3 Player C      0     2                 2
4 Player A      0     1                 4 
5 Player A      1     2                 4 
6 Player B      0     2                 3

If this isn't what you are looking for, then please edit your question to make it clearer.
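The example above sums Half per player. If the number of first-half makes is what is wanted instead, a conditional sum inside mutate is one way to sketch it. The snippet rebuilds the same DATA frame so it runs on its own; subsetting Make_Miss by Half == 1 is an assumption about the intended calculation.

```r
library(dplyr)

# Same sample data as above
DATA <- data.frame(
  Name      = c("Player A", "Player B", "Player C", "Player A", "Player A", "Player B"),
  Make_Miss = c(1, 1, 0, 0, 1, 0),
  Half      = c(1, 1, 2, 1, 2, 2)
)

# Per player, add up Make_Miss over first-half rows only;
# mutate() repeats that group total on every row of the player.
OUT <- DATA %>%
  group_by(Name) %>%
  mutate(Player_First_Made = sum(Make_Miss[Half == 1])) %>%
  ungroup()

print(OUT)
```

A player with no first-half shots, such as Player C here, gets 0 because the sum of an empty vector is 0.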

Answer 2

A data.table way to do this would be:

dat[Half == 1, .(Player_First_Made = sum(Make_Miss)), .(Name)
    ][dat, on = c('Name')]

The first line counts the number of shots (sum(Make_Miss)) each player (.(Name)) made in the first half (Half == 1).

The second line joins the resulting aggregated table from the step above back into the original dataset.
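The join above returns a new table. If the column should instead be added to the existing table, grouped assignment with := is another option. A minimal sketch, assuming the question's six rows are rebuilt as dt:

```r
library(data.table)

# The six rows from the question
dt <- data.table(
  Name      = c("Player A", "Player B", "Player A", "Player A", "Player A", "Player B"),
  Make_Miss = c(1, 1, 0, 0, 1, 0),
  Half      = c(1, 1, 2, 1, 1, 2)
)

# Within each Name group, sum Make_Miss over first-half rows only and
# assign that single value to every row of the group, by reference.
dt[, Player_First_Made := sum(Make_Miss[Half == 1]), by = Name]

print(dt)
```

Every Player A row gets 2. Player B rows get 1 rather than the 4 shown in the question's desired output, since Player B made one first-half shot in this data.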

Here's the sample data I used:

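library(data.table)

set.seed(1)  # added so the random sample is reproducible; any seed works
dat <-
  data.table(
    Name = c('A', 'B'),                  # recycled to fill all 30 rows
    Make_Miss = round(runif(30, 0, 1)),  # 1 = made, 0 = missed
    Half = round(runif(30, 1, 2))        # 1 = first half, 2 = second half
  )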
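With sample data like this, the same aggregation can also be written as an update join, which adds the column to dat by reference and keeps its original column order. This is a sketch under assumptions: the seed value and the intermediate name agg are chosen for illustration only.

```r
library(data.table)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible
dat <- data.table(
  Name      = rep(c("A", "B"), 15),
  Make_Miss = round(runif(30, 0, 1)),
  Half      = round(runif(30, 1, 2))
)

# Step 1: first-half makes per player
agg <- dat[Half == 1, .(Player_First_Made = sum(Make_Miss)), by = Name]

# Step 2: update join; i.Player_First_Made refers to the column coming from agg
dat[agg, on = "Name", Player_First_Made := i.Player_First_Made]

head(dat)
```

A player with no first-half rows would be absent from agg and so would be left as NA in dat, the same as with the join shown earlier.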

Answer 1 (2018-03-29 10:31:42)

Answer 2 (2018-03-30 21:40:40)
