R: copy string to column if contained in another column

I have a data frame (structured according to dplyr principles) that gives an overview of a literature database. One of the columns is "language", and another is "tag", a deprecated column I'd like to clean up because it holds several pieces of information at once, including the language.

Each book entry has its language in that "tag" entry (along with other information, separated by commas). How can I copy each of these language strings from "tag" to the respective "language" column (currently empty)?

I.e., how can I do: if the "tag" column contains the string "English", then move "English" to the "language" column?

db<-data.frame(tags=c("Moose, English", "Feet, French"), language=NA)

db$language<-ifelse(grepl("English", db$tags), "English", db$language)
db$language<-ifelse(grepl("French", db$tags), "French", db$language)

This has the disadvantage of requiring you to know all the possible languages in the tags column. You might want to run this when you're done to identify any leftover languages:

db$tags[is.na(db$language)]

This will give you the tags from all the cases where no language was assigned yet.
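To turn that leftover check into a list of candidate language names, one option is to split the unassigned tags on commas and look at the distinct tokens. The following is only a sketch: the extra "Klingon" rows and the variable names leftover, tokens and candidates are made up for illustration.

```r
# Example data with two rows whose language is not yet handled (hypothetical)
db <- data.frame(tags = c("Moose, English", "Feet, French",
                          "Hat, Klingon", "Shoe, Klingon"),
                 language = NA, stringsAsFactors = FALSE)

db$language <- ifelse(grepl("English", db$tags), "English", db$language)
db$language <- ifelse(grepl("French", db$tags), "French", db$language)

# Tags of the rows that still have no language
leftover <- db$tags[is.na(db$language)]

# Split on commas, trim whitespace, and keep each distinct token once
tokens <- trimws(unlist(strsplit(leftover, ",", fixed = TRUE)))
candidates <- sort(unique(tokens))
print(candidates)
```

Scanning candidates by eye should then show which tokens are languages that still need to be added to the list.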

UPDATE: A slightly simplified version will use a for loop through a vector with all the language names:

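languages <- c("English", "French", "Spanish")  # [...] add the remaining languages here
for (i in seq_along(languages)) {
  # Overwrite language only in rows whose tags mention languages[i]
  db$language <- ifelse(grepl(languages[i], db$tags), languages[i], db$language)
}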

Both options return

            tags language
1 Moose, English  English
2   Feet, French   French
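If the list of languages is known, the loop can probably be avoided by combining the names into one regular expression and extracting the match directly with regexpr() and regmatches() from base R. This is a sketch rather than part of the original answer; the third row ("Hat, Klingon") is invented to show what happens when no language matches, and it assumes each tag contains at most one language.

```r
db <- data.frame(tags = c("Moose, English", "Feet, French", "Hat, Klingon"),
                 language = NA_character_, stringsAsFactors = FALSE)

languages <- c("English", "French", "Spanish")

# One alternation pattern: "English|French|Spanish"
pattern <- paste(languages, collapse = "|")

# regexpr() gives the position of the first match per row, or -1 if none
m <- regexpr(pattern, db$tags)

# regmatches() returns the matched text for the matching rows only,
# so assign it just to those rows
db$language[m > 0] <- regmatches(db$tags, m)
print(db)
```

Rows without a known language keep NA, so the leftover check with is.na(db$language) still applies.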

You can use grepl to check if a string contains something.

For example:

tag <- c('Containing English Information', 'Containing unknown information')
testdata <- data.frame(tag)
testdata$language <- ifelse(grepl("English", testdata$tag), "English", NA)
testdata

returns

                             tag language
1 Containing English Information  English
2 Containing unknown information     <NA>
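One caveat: a plain grepl("English", ...) also matches longer words that contain the search string, and it is case-sensitive. If that matters for your tags, a word-boundary pattern with ignore.case = TRUE may be closer to what you want. The example strings below are made up for illustration.

```r
tag <- c("Containing English Information",
         "The Englishman abroad",
         "written in english")

# Plain substring match: also hits "Englishman", misses lowercase "english"
loose <- grepl("English", tag)

# Whole-word match, ignoring case
strict <- grepl("\\bEnglish\\b", tag, ignore.case = TRUE)

print(data.frame(tag, loose, strict))
```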
