
R: String match from column1 selects data from column2's rows to create column3?

I am trying to match a string in column1 and, for the rows that match, copy the corresponding value from column2 into a new column3. Rows that do not match should be left empty in column3.

I hope this is clear.

Example: Partial string "dog"

DF
#   Column1      column2  column3
#1    doggy            x        x
#2      cat            y       
#3     bird            y
#4    doggy            z        z
#5      cat            x
#6     bird            y

Thank you!

We can accomplish this with the dplyr and stringr packages.

Use mutate to create the new column3 variable.

case_when is a vectorised way to write a chain of if_else() calls. Each argument is a two-sided formula: the left-hand side is a condition, here str_detect(Column1, "dog"), which tests whether the pattern is present in Column1, and the right-hand side is the value to return when that condition holds. If the pattern is present, the value in column2 is returned in column3. If the pattern is not present, an empty string is returned (the TRUE ~ "" line).

Thanks for the data, Ronak!

df <- structure(list(Column1 = c("doggy", "cat", "bird", "doggy", "cat", 
"bird"), column2 = c("x", "y", "y", "z", "x", "y")), 
class = "data.frame", row.names = c(NA, -6L))


library(dplyr)
library(stringr)

df %>% 
  mutate(
    column3 = case_when(
      str_detect(Column1, "dog") ~ column2,
      TRUE ~ ""
    )
  )

#>   Column1 column2 column3
#> 1   doggy       x       x
#> 2     cat       y        
#> 3    bird       y        
#> 4   doggy       z       z
#> 5     cat       x        
#> 6    bird       y

Created on 2021-03-11 by the reprex package (v0.3.0)
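Because there is only one condition here, a plain if_else() would also do the job. The sketch below is a variation, not part of the answer above: it assumes you would rather mark non-matching rows with NA than with an empty string, and it wraps the pattern in fixed() so "dog" is treated as a literal string rather than a regular expression. The name result is made up for illustration.

```r
library(dplyr)
library(stringr)

# Same example data as above; stringsAsFactors = FALSE keeps the columns
# as character vectors on R versions older than 4.0 as well.
df <- data.frame(
  Column1 = c("doggy", "cat", "bird", "doggy", "cat", "bird"),
  column2 = c("x", "y", "y", "z", "x", "y"),
  stringsAsFactors = FALSE
)

# One condition only, so if_else() is enough.
# NA_character_ (an assumption about what you want downstream) marks
# "no match" and has the same type as column2, which if_else() requires.
result <- df %>%
  mutate(
    column3 = if_else(str_detect(Column1, fixed("dog")), column2, NA_character_)
  )

print(result)
```

Using NA instead of "" makes the unmatched rows easy to find later with is.na(), but keep the empty string if column3 is only meant for display.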

We can use ifelse with grepl:

transform(df, column3 = ifelse(grepl('dog', Column1), column2, ''))

#  Column1 column2 column3
#1   doggy       x       x
#2     cat       y        
#3    bird       y        
#4   doggy       z       z
#5     cat       x        
#6    bird       y        

data

df <- structure(list(Column1 = c("doggy", "cat", "bird", "doggy", "cat", 
"bird"), column2 = c("x", "y", "y", "z", "x", "y")), 
class = "data.frame", row.names = c(NA, -6L))
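The same result can also be reached in base R by assigning into a logical subset, which some people find easier to read than ifelse. This is only a sketch under a couple of assumptions: matched is a name made up for illustration, and fixed = TRUE is added so the pattern is matched literally rather than as a regular expression.

```r
# Same example data; stringsAsFactors = FALSE keeps the columns as
# character vectors on R versions older than 4.0 as well.
df <- data.frame(
  Column1 = c("doggy", "cat", "bird", "doggy", "cat", "bird"),
  column2 = c("x", "y", "y", "z", "x", "y"),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE where Column1 contains the literal text "dog".
matched <- grepl("dog", df$Column1, fixed = TRUE)

# Start with an empty column, then fill in only the matching rows.
df$column3 <- ""
df$column3[matched] <- df$column2[matched]

print(df)
```

Unlike transform, this modifies df in place, so work on a copy if you need the original data frame untouched.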
