Selecting a subset where a categorical variable (column) can have 2 values

My data consists of frequency tables listed one under another for different models and scenarios (i.e. variables). I want to select subsets of this data set to make a graph for each one. Most of my variables are categorical text (e.g. weather, scenario). I couldn't find a way to allow multiple values of a categorical variable (the examples of %in% c() I found mostly use numbers). I tried the following:

ThisSelection <- subset (Hist, all_seeds==0 & weather == "normal" & scenario %in% c("intact","depauperate"))

which doesn't work, and

ThisSelection <- subset (Hist, all_seeds==0 & scenario =="intact" | scenario =="depauperate")

which gives only "intact" scenarios.

My apologies if the answer is simple. I searched the web but couldn't find where I'm going wrong, and I believe there must be another way than turning the string values into numerical ones. I'm a beginner in R...

Your first attempt should work. I hesitate to suggest it, but is your spelling of "depauperate" consistent, including case? Here it works on simulated data:

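# Simulated data; no seed is set, so the sampled rows differ from run to run
Hist <- data.frame(all_seeds = 0,
                   weather  = sample(c("normal", "odd"), 20, replace = TRUE),
                   scenario = sample(c("intact", "depauperate"), 20, replace = TRUE))
ThisSelection <- subset(Hist, all_seeds == 0 & weather == "normal" & scenario %in% c("intact", "depauperate"))
ThisSelection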


   all_seeds weather    scenario
1          0  normal      intact
3          0  normal      intact
4          0  normal      intact
5          0  normal depauperate
6          0  normal      intact
10         0  normal depauperate
14         0  normal      intact
15         0  normal      intact
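
If the same call returns nothing on the real data, looking at the distinct values usually shows why. The sketch below is only an illustration: `Hist_demo`, its rows, and the `scenario_clean` column are made up here, with a deliberate capital letter and a trailing space standing in for whatever inconsistency the real column might have.

```r
# Hypothetical data with inconsistent spelling: mixed case and a trailing space
Hist_demo <- data.frame(
  all_seeds = 0,
  weather   = "normal",
  scenario  = c("intact", "Depauperate", "depauperate ", "intact"),
  stringsAsFactors = FALSE
)

# Inspect the values actually present; the quotes make stray spaces visible
print(unique(Hist_demo$scenario))
print(table(Hist_demo$scenario, useNA = "ifany"))

# Without cleaning, only the exactly matching rows are kept
raw_match <- subset(Hist_demo, scenario %in% c("intact", "depauperate"))

# Normalise: strip surrounding whitespace and lower-case, then filter again
Hist_demo$scenario_clean <- tolower(trimws(Hist_demo$scenario))
clean_match <- subset(Hist_demo, scenario_clean %in% c("intact", "depauperate"))

nrow(raw_match)    # 2: only the two "intact" rows match exactly
nrow(clean_match)  # 4: all rows match after normalising
```

Note that %in% compares character values (and factor levels) just as it does numbers, so no conversion to numeric codes is needed.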

Don't forget about operator precedence: & binds more tightly than |, so a & b | c is evaluated as (a & b) | c. Put the alternatives in parentheses:

set.seed(3099627)
Hist <- data.frame(first  = sample(letters[1:3], 20, replace = TRUE),
                   second = sample(letters[4:6], 20, replace = TRUE))
# Parentheses make the | apply only to the two values of second
subset(Hist, first == "a" & (second == "d" | second == "e"))

   first second
1      a      e
4      a      d
15     a      e
20     a      d

subset (Hist, first=="a" & (second %in% c("d", "e")))

   first second
1      a      e
4      a      d
15     a      e
20     a      d
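
Applied to the second expression from the question, the same rule shows what goes wrong. This is a minimal sketch on a small made-up data frame (`Hist_demo` and its rows are assumptions for illustration, not the asker's data) comparing the condition without parentheses to the parenthesised and %in% versions:

```r
# Hypothetical data: every combination of all_seeds and scenario
Hist_demo <- expand.grid(
  all_seeds = c(0, 1),
  scenario  = c("intact", "depauperate", "other"),
  stringsAsFactors = FALSE
)

# As written in the question: parsed as (all_seeds == 0 & intact) | depauperate
no_parens <- subset(Hist_demo,
                    all_seeds == 0 & scenario == "intact" | scenario == "depauperate")

# Intended meaning: all_seeds == 0 AND scenario is one of the two values
with_parens <- subset(Hist_demo,
                      all_seeds == 0 & (scenario == "intact" | scenario == "depauperate"))

# %in% avoids the issue because there is no | left to group
with_in <- subset(Hist_demo,
                  all_seeds == 0 & scenario %in% c("intact", "depauperate"))

no_parens    # also contains the depauperate row with all_seeds == 1
with_parens  # only rows with all_seeds == 0
with_in      # same rows as with_parens
```

Without parentheses, the all_seeds filter applies only to the "intact" rows, so "depauperate" rows come through regardless of all_seeds.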
