
Filter on a value in a column and remove rows that don't meet a condition

I'm relatively new to R and have the following problem. I have data on dogs; the relevant columns are age_month and rasnaam (breed).

I have to classify every breed as small, medium, large, etc. For small breeds, all rows where age_month is below 9 have to be removed; for medium breeds, rows where age_month is below 13; for large breeds, rows where age_month is below 24. I've tried a few things, but none of them work. I put the breeds of each size class in a list (I also tried a vector), like this (only the small dogs are shown here):

small_dogs <- list("Affenpinscher", "Bichon frisé", "Bolognezer", "Chihuahua, langhaar",
            "Dandie Dinmont Terrier", "Dwergkeeshond", "Japanse Spaniel",
            "Griffon belge", "Griffon bruxellois", "Kleine Keeshond", 
            "Lhasa Apso", "Maltezer", "Mopshond", "Pekingees", "Petit Brabançon",
            "Shih Tzu", "Tibetaanse Spaniel", "Volpino Italiano", "Yorkshire Terrier")

I tried this:

for (i in 1:nrow(brachquest2)){
     ifelse((brachquest2$rasnaam %in% small_dogs), (brachquest2 <- brachquest2[!(brachquest2$age_month < 9), ]), 
     ifelse((brachquest2$rasnaam %in% medium_dogs)), (brachquest2 <- brachquest2[!(brachquest2$age_month < 13), ]), 
     (brachquest2 <- brachquest2[!(brachquest2$age_month < 24), ]))
            }

But then I get an "unused arguments" error. Then I tried case_when(), but I'm not familiar with this function, so I may be using it completely wrong:

brachquest2 <- case_when(
  brachquest2$rasnaam %in% small_dogs ~ brachquest2[!(brachquest2$age_month < 11), ],
  brachquest2$rasnaam %in% medium_dogs ~ brachquest2[!(brachquest2$age_month < 13), ]
  )

Then I get an error: must be length 66 or one, not 18.

(the number of rows is 66)

I hope I explained this clearly. Does anyone have useful tips? Maybe it could be done much more simply. Any help is appreciated! Thanks in advance.

Below is the dput() output for only age_month and rasnaam, in response to neilfws. I'm not sure whether this is the right way to share it.

structure(list(age_month = structure(c(50, 52, 52.1, 49.7, 49.7, 
49.6, 49.6, 49.6, 49.5, 50, 48.8, 52.1, 51.9, 48.7, 50, 50.2, 
50.4, 50.5, 49, 49, 49, 49, 49, 48.9, 15, 17.6, 17.6, 17.6, 17.6, 
16.3, 17.6, 17.6, 15, 15.8, 16, 16.2, 17.5, 14.9, 10.4, 10.2, 
10.5, 10.4, 10.3, 10.3, 10.2, 10.3, 10.3, 10.3, 12.8, 12.8, 12.8, 
12.8, 12.8, 10, 10.4, 10.2, 10.3, 10.3, 12.7, 12.7, 13.2, 13.2, 
13.1, 13.1, 12.7, 12.7), units = "days", class = "difftime"), 
    rasnaam = c("American Staffordshire Terrier", "Boxer", "Bull Terrier", 
    "Chihuahua, langhaar", "Chihuahua, langhaar", "Chihuahua, langhaar", 
    "Chihuahua, langhaar", "Chihuahua, langhaar", "Chihuahua, langhaar", 
    "Chihuahua, langhaar", "Franse Bulldog", "Franse Bulldog", 
    "Labrador Retriever", "Shih Tzu", "American Staffordshire Terrier", 
    "American Staffordshire Terrier", "American Staffordshire Terrier", 
    "American Staffordshire Terrier", "American Staffordshire Terrier", 
    "American Staffordshire Terrier", "American Staffordshire Terrier", 
    "American Staffordshire Terrier", "American Staffordshire Terrier", 
    "American Staffordshire Terrier", "American Staffordshire Terrier", 
    "Boxer", "Boxer", "Boxer", "Boxer", "Boxer", "Bull Terrier", 
    "Bull Terrier", "Chihuahua, langhaar", "Chihuahua, langhaar", 
    "Chihuahua, langhaar", "Chihuahua, langhaar", "Chihuahua, langhaar", 
    "Franse Bulldog", "Franse Bulldog", "Franse Bulldog", "Franse Bulldog", 
    "Franse Bulldog", "Labrador Retriever", "Labrador Retriever", 
    "Labrador Retriever", "Labrador Retriever", "Labrador Retriever", 
    "Labrador Retriever", "Labrador Retriever", "Labrador Retriever", 
    "Labrador Retriever", "Labrador Retriever", "Labrador Retriever", 
    "Shih Tzu", "Shih Tzu", "Shih Tzu", "Shih Tzu", "Shih Tzu", 
    "American Staffordshire Terrier", "Boxer", "Franse Bulldog", 
    "Franse Bulldog", "Shih Tzu", "Shih Tzu", "American Staffordshire Terrier", 
    "Boxer")), row.names = c(NA, -66L), class = "data.frame")

If you want to stick with case_when, this is one way to achieve what you're looking for:

library(dplyr)

brachquest2 %>%
  mutate(
    # Create a temp var, removal_status, to label what rows should be kept or removed
    removal_status = case_when(
      (rasnaam %in% small_dogs) & age_month < 9 ~ "Remove",
      (rasnaam %in% medium_dogs) & age_month < 13 ~ "Remove",
      (rasnaam %in% large_dogs) & age_month < 24 ~ "Remove",
      TRUE ~ "Keep"
    )) %>% 
  # Keep only what's labelled "Keep"
  filter(removal_status == "Keep") %>% 
  # Remove temp var
  select(-removal_status)

Using the list of small_dogs you provided and my own list of medium_dogs containing a single value, "Boxer", I got the following (the 2 Boxers with age_month under 13 were removed):

#   age_month                        rasnaam
# 1  50.0 days American Staffordshire Terrier
# 2  52.0 days                          Boxer
# 3  52.1 days                   Bull Terrier
# 4  49.7 days            Chihuahua, langhaar
# 5  49.7 days            Chihuahua, langhaar
# 6  49.6 days            Chihuahua, langhaar
# 7  49.6 days            Chihuahua, langhaar
# 8  49.6 days            Chihuahua, langhaar
# 9  49.5 days            Chihuahua, langhaar
# 10 50.0 days            Chihuahua, langhaar
# 11 48.8 days                 Franse Bulldog
# 12 52.1 days                 Franse Bulldog
# 13 51.9 days             Labrador Retriever
# 14 48.7 days                       Shih Tzu
# 15 50.0 days American Staffordshire Terrier
# 16 50.2 days American Staffordshire Terrier
# 17 50.4 days American Staffordshire Terrier
# 18 50.5 days American Staffordshire Terrier
# 19 49.0 days American Staffordshire Terrier
# 20 49.0 days American Staffordshire Terrier
# 21 49.0 days American Staffordshire Terrier
# 22 49.0 days American Staffordshire Terrier
# 23 49.0 days American Staffordshire Terrier
# 24 48.9 days American Staffordshire Terrier
# 25 15.0 days American Staffordshire Terrier
# 26 17.6 days                          Boxer
# 27 17.6 days                          Boxer
# 28 17.6 days                          Boxer
# 29 17.6 days                          Boxer
# 30 16.3 days                          Boxer
# 31 17.6 days                   Bull Terrier
# 32 17.6 days                   Bull Terrier
# 33 15.0 days            Chihuahua, langhaar
# 34 15.8 days            Chihuahua, langhaar
# 35 16.0 days            Chihuahua, langhaar
# 36 16.2 days            Chihuahua, langhaar
# 37 17.5 days            Chihuahua, langhaar
# 38 14.9 days                 Franse Bulldog
# 39 10.4 days                 Franse Bulldog
# 40 10.2 days                 Franse Bulldog
# 41 10.5 days                 Franse Bulldog
# 42 10.4 days                 Franse Bulldog
# 43 10.3 days             Labrador Retriever
# 44 10.3 days             Labrador Retriever
# 45 10.2 days             Labrador Retriever
# 46 10.3 days             Labrador Retriever
# 47 10.3 days             Labrador Retriever
# 48 10.3 days             Labrador Retriever
# 49 12.8 days             Labrador Retriever
# 50 12.8 days             Labrador Retriever
# 51 12.8 days             Labrador Retriever
# 52 12.8 days             Labrador Retriever
# 53 12.8 days             Labrador Retriever
# 54 10.0 days                       Shih Tzu
# 55 10.4 days                       Shih Tzu
# 56 10.2 days                       Shih Tzu
# 57 10.3 days                       Shih Tzu
# 58 10.3 days                       Shih Tzu
# 59 12.7 days American Staffordshire Terrier
# 60 13.2 days                 Franse Bulldog
# 61 13.2 days                 Franse Bulldog
# 62 13.1 days                       Shih Tzu
# 63 13.1 days                       Shih Tzu
# 64 12.7 days American Staffordshire Terrier
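
For comparison, the same flag-and-filter idea can be sketched in base R with plain logical indexing, with no temporary column. This is only an illustration: the toy data frame dogs and the medium_dogs and large_dogs vectors below are made up, not the real breed lists, and age_month is stored as a plain number here.

```r
# Toy data standing in for brachquest2 (values invented for illustration)
dogs <- data.frame(
  rasnaam   = c("Shih Tzu", "Shih Tzu", "Boxer", "Boxer",
                "Labrador Retriever", "Labrador Retriever"),
  age_month = c(8, 10, 12.7, 17.6, 12.8, 51.9),
  stringsAsFactors = FALSE
)

# Hypothetical size classes; replace with the real breed lists
small_dogs  <- c("Shih Tzu", "Chihuahua, langhaar")
medium_dogs <- c("Boxer")
large_dogs  <- c("Labrador Retriever")

# One logical value per row: TRUE if the row is too young for its size class
too_young <- (dogs$rasnaam %in% small_dogs  & dogs$age_month < 9)  |
             (dogs$rasnaam %in% medium_dogs & dogs$age_month < 13) |
             (dogs$rasnaam %in% large_dogs  & dogs$age_month < 24)

# Keep every row that is not flagged
dogs_kept <- dogs[!too_young, ]
```

The key difference from the loop in the question is that the condition is evaluated once for the whole column, and the data frame is subset a single time.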

Adjust the lists and the age_month conditions as you see fit.
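
If more size classes are added later, a lookup from breed to minimum age may be easier to maintain than one condition per class. The following is a minimal base R sketch under the same assumptions as the question's thresholds (9, 13, 24); the toy data and the medium_dogs and large_dogs vectors are hypothetical, and breeds that appear in no list are kept.

```r
# Toy data standing in for brachquest2 (values invented for illustration)
dogs <- data.frame(
  rasnaam   = c("Shih Tzu", "Shih Tzu", "Boxer", "Boxer",
                "Labrador Retriever", "Bull Terrier"),
  age_month = c(8, 10, 12.7, 17.6, 12.8, 5),
  stringsAsFactors = FALSE
)

# Hypothetical size classes; replace with the real breed lists
small_dogs  <- c("Shih Tzu", "Chihuahua, langhaar")
medium_dogs <- c("Boxer")
large_dogs  <- c("Labrador Retriever")

# Named vector mapping each breed to its minimum age in months
min_age <- c(
  setNames(rep(9,  length(small_dogs)),  small_dogs),
  setNames(rep(13, length(medium_dogs)), medium_dogs),
  setNames(rep(24, length(large_dogs)),  large_dogs)
)

# Look up the threshold for every row; unlisted breeds give NA
threshold <- unname(min_age[dogs$rasnaam])

# Keep rows with no threshold, or whose age meets the threshold.
# as.numeric() also covers the case where age_month is a difftime.
keep <- is.na(threshold) | as.numeric(dogs$age_month) >= threshold
dogs_filtered <- dogs[keep, ]
```

Here "Bull Terrier" is in none of the lists, so its row is kept regardless of age; change the is.na(threshold) part if unlisted breeds should be dropped instead.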


 