R - Add values (derived from a formula) to a data frame column based on a condition met by values in another data frame's column

Question

Here is an example dataset:

data = data.frame('Cat' = c('A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'C', 'C'),
                  'Value' = c(1,1,1,2,2,3,3,3,3,3))
data

Another data frame:

a = data.frame('Name' = c('A', 'B', 'C', 'D'))

Desired output:

I want to understand how to reference another cell within the same row of a data frame and apply a function to that cell's value.

This worked for "In Data?":

a[,'In Data?'] = ifelse(a$Name %in% unique(data$Cat), "Y", "N")

This failed for median:

b$Median = median(data$Cat[data$Cat == a$Name])

Error message:
Error in Ops.factor(data$Cat, a$Name) : 
  level sets of factors are different

This failed for count:

a$Count = ifelse(a$Name %in% unique(data$Cat), length(data$Cat==a$Name), 0)

Error:
Error in Ops.factor(data$Cat, a$Name) : 
  level sets of factors are different

Columns of the second data frame:

Cat: A, B, C, D
Count:
Proportion:
Median:
Values > median:
f(x): {count + 10}
In Data?:

Answer 1

It's better to frame these operations as merging and summarizing. (Talking in terms of cells and rows seems very Excel-like rather than R-like.) The dplyr package helps a lot here:

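library(dplyr)

a %>% 
  left_join(data, by = c("Name" = "Cat")) %>%   # keep every Name, matched or not
  group_by(Name) %>% 
  summarize(
    Count = sum(!is.na(Value)),            # unmatched names have Value = NA
    Median = median(Value),
    ValuesGtMed = sum(Value > Median),     # values strictly above the group median
    f = Count + 10,
    InData = if_else(Count > 0, "Y", "N")
  ) %>% 
  mutate(Proportion = Count / sum(Count))  # share of all matched rows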

The left_join makes sure we keep every value in a; we then apply different summary functions to the groups defined by Name.
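To see what the join contributes on its own, here is a minimal sketch that looks at the joined table before any summarizing. The names joined and counts are only illustrative, and stringsAsFactors = FALSE is an assumption added so the key columns are character on any R version.

```r
library(dplyr)

# Example data from the question; stringsAsFactors = FALSE (an assumption
# added here) keeps the key columns as character on any R version.
data <- data.frame(Cat = c("A", "A", "A", "B", "B", "C", "C", "C", "C", "C"),
                   Value = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3),
                   stringsAsFactors = FALSE)
a <- data.frame(Name = c("A", "B", "C", "D"), stringsAsFactors = FALSE)

# Illustrative intermediate table: every row of a, plus matching rows of data.
joined <- left_join(a, data, by = c("Name" = "Cat"))

# "D" has no match in data, so it survives as a single row with Value = NA.
tail(joined, 2)

# n() would count that placeholder row; counting non-missing values does not.
counts <- joined %>%
  group_by(Name) %>%
  summarize(Rows = n(), Count = sum(!is.na(Value)))
counts
```

This is why the pipeline counts with sum(!is.na(Value)) rather than n(): the placeholder row for an unmatched name should not count as an observation.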

Output of the full pipeline:

  Name  Count Median ValuesGtMed     f InData Proportion
  <chr> <int>  <dbl>       <int> <dbl> <chr>       <dbl>
1 A         3      1           0    13 Y             0.3
2 B         2      2           0    12 Y             0.2
3 C         5      3           0    15 Y             0.5
4 D         0     NA          NA    10 N             0
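For comparison, here is a base-R sketch of why the original attempts failed and how a per-name lookup could be written without dplyr. Two things are going on. Comparing two factors with == is an error when their level sets differ, and before R 4.0.0 data.frame() converted strings to factors by default. Separately, a$Name is the whole column rather than one row's value, so data$Cat == a$Name never compares against a single name. The helper names f1, f2, err and vals are illustrative, not part of the original answer.

```r
# Example data from the question, with keys stored as character
# (stringsAsFactors = FALSE is an assumption added for this sketch).
data <- data.frame(Cat = c("A", "A", "A", "B", "B", "C", "C", "C", "C", "C"),
                   Value = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3),
                   stringsAsFactors = FALSE)
a <- data.frame(Name = c("A", "B", "C", "D"), stringsAsFactors = FALSE)

# Comparing two factors with different level sets reproduces the error.
f1 <- factor(c("A", "B", "C", "C"))
f2 <- factor(c("A", "B", "C", "D"))
err <- tryCatch(f1 == f2, error = function(e) conditionMessage(e))
print(err)

# Per-name lookup: compare data$Cat against one name at a time.
vals <- lapply(a$Name, function(nm) data$Value[data$Cat == nm])

a$Count  <- lengths(vals)
a$Median <- vapply(vals,
                   function(v) if (length(v) > 0) median(v) else NA_real_,
                   numeric(1))
a
```

The join-and-summarize version above scales better and keeps all columns in one pipeline; this sketch is mainly useful for seeing what "the value in the same row" translates to in vectorized R.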

Answer 1: 2, accepted, 2020-08-28 04:30:53
