
remove characters from column names

I am trying to remove some unwanted characters (numbers, dots and spaces) from my column names in R.

My data is a tibble with the following column names:

tibble [33 x 38] (S3: tbl_df/tbl/data.frame)
 $ year: chr [1:33] "1988" "1989" "1990" "1991"...
 $ VALOR AGREGADO BRUTO (a precios básicos): num [1:33] 9906283 11624212 14163419 17400488 19785184...
 $ 1. PRODUCTOS AGRÍCOLAS NO INDUSTRIALES: num [1:33] 831291 911372 1112167 1434213 1532067...
 $ 2. PRODUCTOS AGRÍCOLAS INDUSTRIALES: num [1:33] 143426 214369 231168 341144 282777...
 $ 3. COCA: num [1:33] 118273 153689 195108 190264 199259...

The desired column names are:

tibble [33 x 38] (S3: tbl_df/tbl/data.frame)
 $ year: chr [1:33] "1988" "1989" "1990" "1991"...
 $ VALOR AGREGADO BRUTO (a precios básicos): num [1:33] 9906283 11624212 14163419 17400488 19785184...
 $ PRODUCTOS AGRÍCOLAS NO INDUSTRIALES: num [1:33] 831291 911372 1112167 1434213 1532067...
 $ PRODUCTOS AGRÍCOLAS INDUSTRIALES: num [1:33] 143426 214369 231168 341144 282777...
 $ COCA: num [1:33] 118273 153689 195108 190264 199259...

I want to remove the number and the dot from the column names. This is what I tried:

colnames(data) <- sub("\\1:4\.\\", "", colnames(data))
colnames(data)

Could somebody please help me?

Best! Marcelo

It's not clear what was wrong with the answers you got, but here's another try. Since you have a data frame and want to rename its columns, you can use stringr's str_remove() inside dplyr::rename_with(). Also, since your data has 38 columns, I'm guessing you need to remove numbers other than just 1-4. To accommodate that, I opened the range to all digits with [0-9] and allowed either one- or two-digit numbers by putting {1,2} after the character class.

library(tidyverse)

# took the column names you showed and added one with a higher number
d <- tibble(year = 1:5,
       "VALOR  AGREGADO  BRUTO (a  precios  básicos)" = 1:5,
       "1. PRODUCTOS AGRÍCOLAS NO INDUSTRIALES" = 1:5,
       "2. PRODUCTOS AGRÍCOLAS INDUSTRIALES" = 1:5,
       "3. COCA" = 1:5,
       "29. OTHER" = 1:5)

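# rename_with() takes a renaming function
# "^" anchors the match at the start of the name; "\\." matches a literal dot
# (a bare "." would match any character)
d %>%
  rename_with(~str_remove(.x, "^[0-9]{1,2}\\. "))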
#> # A tibble: 5 x 6
#>    year `VALOR  AGREGADO  BRUTO ~` `PRODUCTOS AGR~` `PRODUCTOS AGR~`  COCA OTHER
#>   <int>                      <int>            <int>            <int> <int> <int>
#> 1     1                          1                1                1     1     1
#> 2     2                          2                2                2     2     2
#> 3     3                          3                3                3     3     3
#> 4     4                          4                4                4     4     4
#> 5     5                          5                5                5     5     5

Created on 2022-02-17 by the reprex package (v2.0.1)
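
One detail worth checking in patterns like this: in a regular expression an unescaped `.` matches any character, whereas `\\.` (as written in an R string) matches only a literal dot. The following is a minimal base-R sketch of the difference. The vector `nms` and the name "10 COCA" are made up for illustration and are not from the original data.

```r
# Hypothetical column names, for illustration only
nms <- c("year", "3. COCA", "29. OTHER", "10 COCA")

# Unescaped dot: "." matches any character, so the "10 " in "10 COCA"
# is treated as a numbered prefix and removed as well
loose <- sub("[0-9]{1,2}. ", "", nms)

# Escaped dot plus a start anchor: only a leading "<digits>. " prefix is removed
strict <- sub("^[0-9]{1,2}\\. ", "", nms)

print(loose)
print(strict)
```

If none of the real column names contain digits outside the numbered prefix, both patterns should give the same result, but the stricter one is safer on unseen names.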

We can use this pattern, which reads: match one or more digits at the start of the string, followed by a dot and a space, and replace the match with an empty string.

library(stringr)

data <- c("1. PRODUCTOS AGRÍCOLAS NO INDUSTRIALES",
"2. PRODUCTOS AGRÍCOLAS INDUSTRIALES",
"3. SILVICULTURA, CAZA Y PESCA",
"4. PRODUCTOS PECUARIOS") 
  
str_replace(data, '^\\d+\\. ', "")
#> [1] "PRODUCTOS AGRÍCOLAS NO INDUSTRIALES" "PRODUCTOS AGRÍCOLAS INDUSTRIALES"   
#> [3] "SILVICULTURA, CAZA Y PESCA"          "PRODUCTOS PECUARIOS"

Created on 2022-02-16 by the reprex package (v2.0.1)
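
The same pattern can also be applied directly to the column names with base R's sub(), in the style of the attempt in the question. This is only a sketch: the small data frame below is a hypothetical stand-in for the real 38-column tibble, and check.names = FALSE is used so that data.frame() keeps the non-syntactic names unchanged.

```r
# Hypothetical stand-in for the real data; values and the "29. OTHER" column are illustrative
data <- data.frame(
  year = c("1988", "1989"),
  `3. COCA` = c(118273, 153689),
  `29. OTHER` = c(1, 2),
  check.names = FALSE
)

# Remove a leading "<digits>. " prefix from every column name;
# names without such a prefix (e.g. "year") are left untouched
colnames(data) <- sub("^\\d+\\. ", "", colnames(data))
colnames(data)
```

Because the result is assigned back to colnames(data), the renamed columns persist in the data frame rather than only being printed.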
