
Count the number of rows associated with a factor level, limited by a maximum value of another variable

Sorry for the vague title, but I did not know how to describe it briefly. I am working with a data frame that has many variables.

Basically, two variables interest me: one is a factor with two levels (for example green, red), the other is a continuous numeric vector (for example, the concentration of pesticides).

   AppleColour PesticidesConcentration
1        green                    1.45
2          red                    3.50
3        green                    1.56
4          red                   54.30
5          red                   53.20
6          red                   53.40
7        green                    2.50
8        green                    6.70
9          red                   32.05
10       green                   34.27

I want to count the number of green apples 1) when the pesticide concentration is > 4 but < 50, and 2) when it is > 20.

df1 <- structure(list(AppleColour = structure(c(1L, 2L, 1L, 2L, 2L, 
2L, 1L, 1L, 2L, 1L), .Label = c("green", "red"), class = "factor"), 
    PesticidesConcentration = c(1.45, 3.5, 1.56, 54.3, 53.2, 
    53.4, 2.5, 6.7, 32.05, 34.27)), .Names = c("AppleColour", 
"PesticidesConcentration"), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10"))

We can create a logical vector with ==, &, > and <, and take the sum of the TRUE values. The !is.na() term makes a row with a missing concentration count as FALSE instead of turning the whole sum into NA.

# 1) green rows with a concentration strictly between 4 and 50
with(df1, sum(AppleColour == "green" &
              PesticidesConcentration > 4 &
              PesticidesConcentration < 50 &
              !is.na(PesticidesConcentration)))

# 2) green rows with a concentration above 20
with(df1, sum(AppleColour == "green" &
              PesticidesConcentration > 20 &
              !is.na(PesticidesConcentration)))
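
If the same kind of count is needed for several ranges, the two calls above could be folded into a small helper. The following is only a sketch: `count_colour`, its `lower`/`upper` arguments, and the `df_na` copy are names made up for illustration, and it uses `sum(..., na.rm = TRUE)` in place of the explicit `!is.na()` term, which should give the same counts here.

```r
# Rebuild the example data so this block runs on its own.
df1 <- data.frame(
  AppleColour = factor(c("green", "red", "green", "red", "red",
                         "red", "green", "green", "red", "green")),
  PesticidesConcentration = c(1.45, 3.5, 1.56, 54.3, 53.2,
                              53.4, 2.5, 6.7, 32.05, 34.27)
)

# Hypothetical helper: count rows of one colour whose concentration lies
# strictly between `lower` and `upper` (the defaults leave a side unbounded).
count_colour <- function(data, colour, lower = -Inf, upper = Inf) {
  hit <- data$AppleColour == colour &
    data$PesticidesConcentration > lower &
    data$PesticidesConcentration < upper
  sum(hit, na.rm = TRUE)  # NA results are dropped rather than counted
}

n_between <- count_colour(df1, "green", lower = 4, upper = 50)  # 2
n_above   <- count_colour(df1, "green", lower = 20)             # 1

# A missing concentration on a green row is simply not counted.
df_na <- df1
df_na$PesticidesConcentration[8] <- NA
n_with_na <- count_colour(df_na, "green", lower = 4, upper = 50)  # 1
```

On the example data, only rows 8 (6.70) and 10 (34.27) are green with a concentration between 4 and 50, and only row 10 is green with a concentration above 20.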


 