Combine grep and factor from different columns in selecting R dataframe rows

I am changing values in the column df$AccPat of a dataframe based on values in other columns of the same row. Starting point:

      AccVs               Verb              Acc      LVs AccPat
0         2            pádsáda               fa        1      u
1         1     pácccaiiidncll               ma        2      u
2         0                saa               un        1      u
3         0               liss               un        0      u
4         1           litátoko               fa        0      u
5         1           wupágaak               ma        1      u

I can combine multiple factors, thus:

df[df$Acc == "fa" & df$LVs == "0",]$AccPat <- "a"

      AccVs               Verb              Acc      LVs AccPat
0         2            pádsáda               fa        1      u
1         1     pácccaiiidncll               ma        2      u
2         0                saa               un        1      u
3         0               liss               un        0      u
4         1           litátoko               fa        0      a
5         1           wupágaak               ma        1      u

or I can use grep to choose rows which match a regular expression in one column:

df[grep("^pá", df$Verb),]$AccPat <- "p"

      AccVs               Verb              Acc      LVs AccPat
0         2            pádsáda               fa        1      p
1         1     pácccaiiidncll               ma        2      p
2         0                saa               un        1      u
3         0               liss               un        0      u
4         1           litátoko               fa        0      a
5         1           wupágaak               ma        1      u

But I would like to do both at the same time, that is, choose only the rows that match the regular expression above and also have a value of "1" in df$AccVs. The desired result:

      AccVs               Verb              Acc      LVs AccPat
0         2            pádsáda               fa        1      u
1         1     pácccaiiidncll               ma        2      b
2         0                saa               un        1      u
3         0               liss               un        0      u
4         1           litátoko               fa        0      a
5         1           wupágaak               ma        1      u

I used to think this was impossible, but this question makes me think it isn't. However, the solution given there does not work for me.

df[grep("^pá", df$Verb) & df$AccVs == "1"]$AccPat <- "b" 

results in the errors "undefined columns selected" and "longer object length is not a multiple of shorter object length", and

df[grep("^pá", df$Verb) & df$AccVs == "1",]$AccPat <- "b" 

attempts to print my entire dataframe (which is much larger than this sample one), and also results in the error "longer object length is not a multiple of shorter object length".

Note: Many of the values I am checking for are strings, so I need a solution that works for strings. I'm not doing anything numeric, so it's fine if I treat the integers as strings.

Here's your data frame:

df <- data.frame(AccVs=c(2,1,0,0,1,1),
                 Verb=c("pádsáda","pácccaiiidncll","saa","liss","litátoko","wupágaak"),
                 Acc=c("fa","ma","un","un","fa","ma"),
                 LVs=c(1,2,1,0,0,1),
                 AccPat=rep("u",6),
                 stringsAsFactors=FALSE)

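grep and == return different kinds of vector: grep returns the integer positions of the matching elements, while == returns a logical vector with one element per row: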

grep("^pá", df$Verb)
[1] 1 2

df$AccVs == "1"
[1] FALSE  TRUE FALSE FALSE  TRUE  TRUE
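
To illustrate why mixing the two goes wrong, here is a small sketch using the sample data (the names `verb`, `accvs`, `idx`, `flag` and `combined` are only for illustration). `&` coerces the integer positions from grep to logical, so every non-zero position becomes TRUE, and recycles them to the length of the longer vector. The regex condition is silently lost. When the number of rows is not a multiple of the number of matches, R also emits the "longer object length is not a multiple of shorter object length" message quoted in the question.

```r
verb  <- c("pádsáda", "pácccaiiidncll", "saa", "liss", "litátoko", "wupágaak")
accvs <- c(2, 1, 0, 0, 1, 1)

idx  <- grep("^pá", verb)   # integer positions of the matching elements
flag <- accvs == "1"        # logical vector, one element per row

# The positions are non-zero, so they are all coerced to TRUE and recycled;
# the combined vector ends up identical to flag alone.
combined <- idx & flag
print(identical(combined, flag))
```

Indexing with `combined` would therefore select every row where AccVs is 1, regardless of the verb.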

Use grepl instead, which returns a logical vector of the same length as its input:

grepl("^pá", df$Verb)
[1]  TRUE  TRUE FALSE FALSE FALSE FALSE
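
Since both conditions are now logical vectors with one element per row, they can be combined with `&` and used as the row index. A minimal sketch, rebuilding the sample data frame so the block runs on its own (the name `sel` is just for illustration):

```r
df <- data.frame(AccVs = c(2, 1, 0, 0, 1, 1),
                 Verb = c("pádsáda", "pácccaiiidncll", "saa", "liss", "litátoko", "wupágaak"),
                 Acc = c("fa", "ma", "un", "un", "fa", "ma"),
                 LVs = c(1, 2, 1, 0, 0, 1),
                 AccPat = rep("u", 6),
                 stringsAsFactors = FALSE)

# TRUE only where the verb starts with "pá" AND AccVs equals 1
sel <- grepl("^pá", df$Verb) & df$AccVs == "1"

# Same assignment form as in the question, now with a proper logical index
df[sel, ]$AccPat <- "b"
print(df)
```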

Combining the two logical vectors with & in the row index gives the result:

  AccVs           Verb Acc LVs AccPat
1     2        pádsáda  fa   1      u
2     1 pácccaiiidncll  ma   2      b
3     0            saa  un   1      u
4     0           liss  un   0      u
5     1       litátoko  fa   0      u
6     1       wupágaak  ma   1      u
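
As a variation, the column can be indexed directly instead of assigning into a row subset of the data frame. This is only a sketch of an alternative spelling, not something the question requires. The names `sel` and `none` and the pattern "^xyz" are made up for illustration. With this form, a condition that matches no rows simply leaves the column unchanged.

```r
df <- data.frame(AccVs = c(2, 1, 0, 0, 1, 1),
                 Verb = c("pádsáda", "pácccaiiidncll", "saa", "liss", "litátoko", "wupágaak"),
                 Acc = c("fa", "ma", "un", "un", "fa", "ma"),
                 LVs = c(1, 2, 1, 0, 0, 1),
                 AccPat = rep("u", 6),
                 stringsAsFactors = FALSE)

sel <- grepl("^pá", df$Verb) & df$AccVs == "1"

# Index the column vector itself with the logical condition
df$AccPat[sel] <- "b"

# A condition with no matching rows: the assignment changes nothing
none <- grepl("^xyz", df$Verb) & df$AccVs == "1"
df$AccPat[none] <- "z"
print(df$AccPat)
```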
