Removing numbers and characters from column names in R

I'm trying to remove specific numbers and characters from the column names of a data frame in R, but I'm only able to remove the numbers. I have tried different approaches, but the characters at the end still remain.

Each column name consists of letters followed by a number in parentheses, e.g. ASE (232)

DataFrame

Subject ASE (232) ASD (121) AFD (313)
   1        1.1.     1.2     1.3

Desired Data Frame

Subject ASE ASD AFD
   1    1.1 1.2 1.3

Code

colnames(data)<-gsub("[A-Z] ([0-9]+)","",colnames(data))

We can change the pattern to match one or more spaces ( \\s+ ), followed by an opening parenthesis ( \\( ), one or more digits ( \\d+ ), and any remaining characters ( .* ), and replace the match with an empty string ( "" )

colnames(data) <- sub("\\s+\\(\\d+.*", "", colnames(data))
colnames(data)
[1] "Subject" "ASE"     "ASD"     "AFD"    

Another option is trimws from base R (its whitespace argument requires R 3.6.0 or later)

trimws(colnames(data), whitespace = "\\s+\\(.*")
[1] "Subject" "ASE"     "ASD"     "AFD"    

In the OP's code, the pattern matches an uppercase letter followed by a space and then one or more digits. The ( is a regex metacharacter; because it is not escaped, ([0-9]+) is treated as a capture group for the digits rather than as literal parentheses. This doesn't match the column names, where the space is followed by a literal ( , so gsub returns the strings unchanged

gsub("[A-Z] ([0-9]+)","",colnames(data))
[1] "Subject"   "ASE (232)" "ASD (121)" "AFD (313)"
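
As a minimal sketch of how the OP's own pattern could be repaired (the vector cols and the result names here are illustrative, not from the original post): escaping the parentheses makes them match literally, and dropping the leading [A-Z] keeps the last letter of each name from being removed along with the suffix.

```r
# Column names from the question, recreated as a plain character vector
cols <- c("Subject", "ASE (232)", "ASD (121)", "AFD (313)")

# Escaped parentheses match literally; only " (digits)" is removed
fixed <- gsub(" \\([0-9]+\\)", "", cols)
fixed

# Keeping [A-Z] in the pattern would also delete the letter before the space
too_greedy <- gsub("[A-Z] \\([0-9]+\\)", "", cols)
too_greedy
```

The first call yields the desired names, while the second one trims each prefix by one letter (for example "AS" instead of "ASE").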

data

data <- structure(list(Subject = 1L, `ASE (232)` = "1.1.", `ASD (121)` = 1.2, 
    `AFD (313)` = 1.3), class = "data.frame", row.names = c(NA, 
-1L))

You can do this:

sub("(\\w+).*", "\\1", colnames(data))

This uses the backreference \\1 to "remember" the first run of word characters ( \\w+ : letters, digits, and underscores) captured by the parentheses, and replaces the whole string with just that captured part via the replacement argument of sub.
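
One thing to keep in mind with this approach is that it keeps only the first word of each name. The sketch below compares it with the suffix-stripping pattern on a few made-up names ("Heart Rate (12)" and "X_1 (3)" are hypothetical examples, not from the question):

```r
# Hypothetical names, including one that contains a space before the suffix
nms <- c("ASE (232)", "Heart Rate (12)", "X_1 (3)")

# Backreference approach: keep the first run of word characters only
first_run <- sub("(\\w+).*", "\\1", nms)
first_run

# Suffix approach: remove the spaces, "(" and everything after the digits start
no_suffix <- sub("\\s+\\(\\d+.*", "", nms)
no_suffix
```

For the names in the question both give the same result; they differ only when a name contains more than one word, where the backreference version returns "Heart" and the suffix version returns "Heart Rate".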

We could also use word from the stringr package along with rename_with from dplyr:

library(stringr)
library(dplyr)
data %>% 
  rename_with(~word(., 1))
  Subject  ASE ASD AFD
1       1 1.1. 1.2 1.3
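
If adding packages is not an option, the same "keep the first word" idea can be sketched in base R. This assumes, as in the question, that the words in each name are separated by a single space; first_word is a hypothetical helper name, not a function from any package.

```r
# Recreate the example data; check.names = FALSE keeps the original names
data <- data.frame(Subject = 1L, `ASE (232)` = "1.1.", `ASD (121)` = 1.2,
                   `AFD (313)` = 1.3, check.names = FALSE)

# Hypothetical helper: split each name on a literal space, keep the first piece
first_word <- function(x) {
  vapply(strsplit(x, " ", fixed = TRUE), `[`, character(1), 1)
}

names(data) <- first_word(names(data))
names(data)
```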
