
Extract only characters, including spaces, from a column

My data is structured as:

df <- data.frame(Athlete = c('02 Paul Jones', '02 Paul Jones', '02 Paul Jones', '02 Paul Jones',
                             '02 Paul Jones', '02 Paul Jones', '02 Paul Jones', '02 Paul Jones',
                             '01 Joe Smith', '01 Joe Smith', '01 Joe Smith', '01 Joe Smith',
                             '01 Joe Smith', '01 Joe Smith', '01 Joe Smith', '01 Joe Smith'),
                 Period = c('P1', 'P1', 'P1', 'P1',
                            'P2', 'P2', 'P2', 'P2',
                            'P1', 'P1', 'P1', 'P1',
                            'P2', 'P2', 'P2', 'P2'))
# Make `Athlete` column a character
df$Athlete <- as.character(df$Athlete)

How do I extract the first and last names of each athlete while keeping the space between the first and last name? I do not want the leading space included either. For example, "Paul Jones", not " Paul Jones".

Remove everything except alphabetic characters ([:alpha:]) and whitespace ([:space:]) using POSIX character classes in the regular expression, then strip the leading whitespace that is left behind once the digits are gone.

df$Athlete <- as.character(df$Athlete)  # convert factor to character

df$Athlete <- gsub("[^[:alpha:][:space:]]", '', df$Athlete) 
df$Athlete <- gsub("^[[:space:]]+", '', df$Athlete )  # removing leading spaces

head(df)
#       Athlete Period
# 1  Paul Jones     P1
# 2  Paul Jones     P1
# 3  Paul Jones     P1
# 4  Paul Jones     P1
# 5  Paul Jones     P2
# 6  Paul Jones     P2
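
One caveat with the character-class approach above: it deletes every non-letter, not just the leading number, so punctuation inside a name is lost as well. A minimal sketch, assuming a hypothetical athlete "03 Mary O'Neil-Smith" that is not in the original data:

```r
# Hypothetical input: the second name is invented for illustration only
athletes <- c("02 Paul Jones", "03 Mary O'Neil-Smith")

# Step 1: drop everything that is not a letter or whitespace
cleaned <- gsub("[^[:alpha:][:space:]]", "", athletes)
# Step 2: drop the leading whitespace left behind by the removed digits
cleaned <- gsub("^[[:space:]]+", "", cleaned)

cleaned
# [1] "Paul Jones"      "Mary ONeilSmith"
```

For the data in the question this makes no difference, but if names can contain apostrophes or hyphens, a pattern that removes only the leading digits is probably the safer choice.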

We can use sub to match one or more digits ( [0-9]+ ) followed by one or more whitespace characters ( \\s+ ) at the start ( ^ ) of the string and replace the match with "" (the empty string).

df$Athlete <- sub("^[0-9]+\\s+", "", df$Athlete)
df
#      Athlete Period
#1  Paul Jones     P1
#2  Paul Jones     P1
#3  Paul Jones     P1
#4  Paul Jones     P1
#5  Paul Jones     P2
#6  Paul Jones     P2
#7  Paul Jones     P2
#8  Paul Jones     P2
#9   Joe Smith     P1
#10  Joe Smith     P1
#11  Joe Smith     P1
#12  Joe Smith     P1
#13  Joe Smith     P2
#14  Joe Smith     P2
#15  Joe Smith     P2
#16  Joe Smith     P2
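
Because the pattern is anchored with ^, only the leading number and the whitespace after it are removed; the rest of the string is left alone. A small sketch on a standalone vector, where "03 Mary O'Neil-Smith" is a hypothetical value added for illustration, also shows that the POSIX-class spelling of the same pattern gives the same result:

```r
# Hypothetical input: the third name is invented for illustration only
athletes <- c("02 Paul Jones", "01 Joe Smith", "03 Mary O'Neil-Smith")

# Strip the leading digits and the whitespace that follows them
names_only <- sub("^[0-9]+\\s+", "", athletes)

# Equivalent pattern written with POSIX character classes
names_posix <- sub("^[[:digit:]]+[[:space:]]+", "", athletes)

names_only
# [1] "Paul Jones"        "Joe Smith"         "Mary O'Neil-Smith"
identical(names_only, names_posix)
# [1] TRUE
```

sub is enough here since there is at most one match per string; gsub would behave the same with this anchored pattern.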

