How can I switch the positions of the first and last element of a character string?

I have a data frame with a character column holding names in the format "Lastname Middlename Title". I need to swap "Lastname" and "Title", and the number of middle names varies from row to row.

Examples of input:

Doe John Mr. 
Smith John Doe Mr.

Desired output:

Mr. John Doe 
Mr. John Doe Smith

You can do it with sub and backreferences, using the data x <- c("Doe John Mr.", "Smith John Doe Mr."):

sub("^(\\w+)( .* )(\\w+\\.?)$", "\\3\\2\\1", x)

#### OUTPUT ####

[1] "Mr. John Doe"       "Mr. John Doe Smith"

This captures three groups: 1) the first word in the string, ^(\\w+); 2) everything between the first word and the last word, ( .* ); and 3) the last word in the string, followed by 0 or 1 periods, (\\w+\\.?)$. It then swaps groups 1 and 3 while leaving group 2 where it is.
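Since the names live in a data frame column, the same call can be applied to the whole column at once. Here is a minimal sketch, assuming a hypothetical data frame `df` with a column called `name`. The `trimws` step is an addition, not part of the answer above, in case the column carries stray leading or trailing whitespace that would stop the `^` and `$` anchors from matching:

```r
# Hypothetical data frame; the column name `name` is an assumption.
df <- data.frame(name = c("Doe John Mr. ", "Smith John Doe Mr."),
                 stringsAsFactors = FALSE)

# Strip leading/trailing whitespace so ^ and $ line up with the words.
clean <- trimws(df$name)

# sub() is vectorised, so groups 1 and 3 are swapped in every row.
df$swapped <- sub("^(\\w+)( .* )(\\w+\\.?)$", "\\3\\2\\1", clean)
df$swapped
```

A string with only two words does not match this pattern, because ( .* ) requires a space on both sides, so sub returns it unchanged.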

We may use strsplit.

str1 <- "Doe John Mr." 
str2 <- "Smith John Doe Mr."

Reduce(paste, el(strsplit(str1, " "))[3:1])
# [1] "Mr. John Doe"

Reduce(paste, el(strsplit(str2, " "))[c(4, 2, 3, 1)])
# [1] "Mr. John Doe Smith"
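The indices above are written by hand for each string. Here is a sketch of how the same strsplit idea might be generalised to any number of middle names. `swap_ends` is a hypothetical helper name, not something from the answer:

```r
# Hypothetical helper: swap the first and last space-separated words.
swap_ends <- function(x) {
  parts <- strsplit(trimws(x), " +")   # one character vector per input string
  vapply(parts, function(p) {
    n <- length(p)
    if (n < 2) return(paste(p, collapse = " "))  # nothing to swap
    p[c(1, n)] <- p[c(n, 1)]           # exchange first and last word
    paste(p, collapse = " ")
  }, character(1))
}

swap_ends(c("Doe John Mr.", "Smith John Doe Mr."))
# [1] "Mr. John Doe"       "Mr. John Doe Smith"
```

Because it works on a whole character vector, it could be applied directly to a data frame column.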

I used the tokenizers package to split the input string into tokens and then reversed their order. Your example looks like it is in reverse order, so that is what I am working from. If you have other examples that are not in reverse order, arrange the tokens in whatever order you need.

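library(tokenizers)
string <- "Doe John Mr. Smith Doe John Mr."
# strip_punct = TRUE drops the period after "Mr"; lowercase = FALSE keeps the
# original capitalisation (tokenize_words lowercases by default).
y <- tokenize_words(string, lowercase = FALSE, strip_punct = TRUE, simplify = TRUE)
rev(y)  # tokens in reverse order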
