Count the number of rows for a factor level, limited by a maximum value of another variable
Sorry for the vague title, but I did not know how to describe it briefly. I am working with a data frame with many variables.
Basically, two variables interest me: one is a factor vector with two levels (for example green, red), and the other is a continuous numeric vector (for example, concentration of pesticides).
AppleColour PesticidesConcentration
1 green 1.45
2 red 3.50
3 green 1.56
4 red 54.30
5 red 53.20
6 red 53.40
7 green 2.50
8 green 6.70
9 red 32.05
10 green 34.27
I want to count the number of green apples 1) when pesticides > 4 but < 50, and 2) when pesticides > 20.
df1 <- structure(list(AppleColour = structure(c(1L, 2L, 1L, 2L, 2L,
2L, 1L, 1L, 2L, 1L), .Label = c("green", "red"), class = "factor"),
PesticidesConcentration = c(1.45, 3.5, 1.56, 54.3, 53.2,
53.4, 2.5, 6.7, 32.05, 34.27)), .Names = c("AppleColour",
"PesticidesConcentration"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
We can create a logical vector with ==, &, >, and <, and get the sum of the TRUE values (each TRUE counts as 1 when summed).
with(df1, sum(AppleColour=="green" &
PesticidesConcentration > 4 & PesticidesConcentration <50 &
!is.na(PesticidesConcentration)))
with(df1, sum(AppleColour == "green" &
PesticidesConcentration > 20 &
!is.na(PesticidesConcentration)))
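If the same kind of count is needed for several ranges, the logical-vector approach above can be wrapped in a small function. This is only a sketch: count_in_range and its arguments are hypothetical names introduced here, not part of the original answer, and the data frame is rebuilt with data.frame() so the snippet runs on its own.

```r
# Rebuild the example data so the snippet is self-contained.
df1 <- data.frame(
  AppleColour = factor(c("green", "red", "green", "red", "red",
                         "red", "green", "green", "red", "green")),
  PesticidesConcentration = c(1.45, 3.5, 1.56, 54.3, 53.2,
                              53.4, 2.5, 6.7, 32.05, 34.27)
)

# Hypothetical helper: count rows of one colour whose concentration
# lies strictly between lower and upper.
count_in_range <- function(data, colour, lower = -Inf, upper = Inf) {
  x <- data$PesticidesConcentration
  # Logical vector: TRUE where the row matches the colour and the range.
  keep <- data$AppleColour == colour & x > lower & x < upper
  # na.rm = TRUE ignores rows where the comparison evaluates to NA.
  sum(keep, na.rm = TRUE)
}

n_green_4_50 <- count_in_range(df1, "green", lower = 4, upper = 50)  # 2
n_green_gt20 <- count_in_range(df1, "green", lower = 20)             # 1
```

With the sample data, only the green rows with 6.70 and 34.27 fall between 4 and 50, and only 34.27 is above 20. Note that the default upper = Inf uses a strict comparison, so an infinite concentration would not be counted.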