
Subsetting five or more columns with different conditions R data.table

I have a data.table that looks like this:

 COUNTRY   GENDER     CURRENCY    INCOME_GROUP    YEAR  
 FRANCE     MAN       EURO            HIGH        2014  
 GERMANY    WOMEN     EURO            LOW         2015  
 FINLAND    MAN       EURO            LOW         2016  
 JAPAN      MAN       YEN             HIGH        2017  
 USA        WOMEN     DOLLAR          LOW         2018  

I want to subset this table with this code: datanew <- data[data$YEAR == "2014"& data$CURRENCY == "DOLLAR" & data$COUNTRY == FRANCE & data$INCOME_GROUP == LOW] but whenever I add three or more conditions, datanew always has 0 observations. In other words, I cannot combine four or more conditions. Is there any way to solve this problem? Thanks for your help.

I presume you want to keep the rows that fulfil all the criteria you gave? In that case, no row in your sample data satisfies all of them at once.

If you try:

library(data.table)

data = fread('COUNTRY   GENDER     CURRENCY    INCOME_GROUP    YEAR  
 FRANCE     MAN       EURO            HIGH        2014  
 GERMANY    WOMEN     EURO            LOW         2015  
 FINLAND    MAN       EURO            LOW         2016  
 JAPAN      MAN       YEN             HIGH        2017  
 USA        WOMEN     DOLLAR          LOW         2018  
')

data[YEAR == "2014" & CURRENCY == "EURO" & COUNTRY == "FRANCE" & INCOME_GROUP == "HIGH"]

returns:

   COUNTRY GENDER CURRENCY INCOME_GROUP YEAR
1:  FRANCE    MAN     EURO         HIGH 2014

Also, you need to wrap quotes around FRANCE and LOW in your statement, and since it's a data.table, you don't need the dollar sign to identify the columns.
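As a minimal sketch of that style with five conditions, assuming the data.table package is installed and rebuilding the sample table by hand (YEAR is stored as an integer here, which is an assumption about how the real data is typed):

```r
library(data.table)

# Sample data from the question, rebuilt explicitly
data <- data.table(
  COUNTRY      = c("FRANCE", "GERMANY", "FINLAND", "JAPAN", "USA"),
  GENDER       = c("MAN", "WOMEN", "MAN", "MAN", "WOMEN"),
  CURRENCY     = c("EURO", "EURO", "EURO", "YEN", "DOLLAR"),
  INCOME_GROUP = c("HIGH", "LOW", "LOW", "HIGH", "LOW"),
  YEAR         = c(2014L, 2015L, 2016L, 2017L, 2018L)
)

# Five conditions at once: bare column names, quoted string values
datanew <- data[YEAR == 2014L & CURRENCY == "EURO" & COUNTRY == "FRANCE" &
                  INCOME_GROUP == "HIGH" & GENDER == "MAN"]
nrow(datanew)
```

The number of conditions is not limited; the result only shrinks to zero rows when no row meets every condition.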

Your code doesn't reproduce the problem you describe. Running it as posted gives this error instead:

Error in `[.data.frame`(data, data$YEAR == "2014" & data$CURRENCY == "DOLLAR" &  : 
  object 'FRANCE' not found

This is because FRANCE and LOW without quotes are looked up as variables, when you should be passing character strings, as you already do with "DOLLAR":

datanew <- data[data$YEAR == "2014"& data$CURRENCY == "DOLLAR" & data$COUNTRY == "FRANCE" & data$INCOME_GROUP == "LOW"]

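With the quotes added, this reproduces your problem. The result is empty simply because no row satisfies all four conditions at once. The message "data frame with 0 columns and 5 rows" appears when data is a plain data.frame, where a single index without a comma selects columns; on a data.table the same expression filters rows and returns an empty table. You can combine as many conditions as you like, but at least one row has to satisfy all of them. On a data.table, the following returns one row: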

data[data$YEAR == "2014"& data$CURRENCY == "EURO" & data$COUNTRY == "FRANCE" & data$INCOME_GROUP == "HIGH"]
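To see why a combined filter comes back empty, it can help to count the matches for each condition separately. The following is a minimal sketch in base R, assuming the sample data from the question stored as a plain data.frame; the names df, conds, per_condition and all_match are introduced here for illustration only. Note that on a data.frame the row filter needs a trailing comma, as in df[cond, ]:

```r
# Sample data from the question as a plain data.frame (base R only)
df <- data.frame(
  COUNTRY      = c("FRANCE", "GERMANY", "FINLAND", "JAPAN", "USA"),
  GENDER       = c("MAN", "WOMEN", "MAN", "MAN", "WOMEN"),
  CURRENCY     = c("EURO", "EURO", "EURO", "YEN", "DOLLAR"),
  INCOME_GROUP = c("HIGH", "LOW", "LOW", "HIGH", "LOW"),
  YEAR         = c(2014, 2015, 2016, 2017, 2018),
  stringsAsFactors = FALSE
)

# Each condition from the question, evaluated on its own (one column each)
conds <- with(df, cbind(
  year_2014   = YEAR == 2014,
  cur_dollar  = CURRENCY == "DOLLAR",
  ctry_france = COUNTRY == "FRANCE",
  income_low  = INCOME_GROUP == "LOW"
))

per_condition <- colSums(conds)             # rows matching each condition alone
all_match <- rowSums(conds) == ncol(conds)  # rows matching every condition

print(per_condition)

# The trailing comma makes this a row filter on a data.frame
print(df[all_match, ])
```

Each condition matches at least one row by itself, but no single row matches all four, so the combined filter is empty.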


 