I have the data.frame df below, which has long variable names.
The first part of each name is the main category (rock, soil, land use), and the second part, usually made up of several words, gives the levels (e.g. for rock, two of the levels are "sandstone mudstone basalt chert limestone" and "sandstone conglomerate coquina tephra").
> df
# A tibble: 5 x 2
`rock_sandstone conglomerate coquina tephra` `rock_sandstone mudstone basalt chert limestone`
<dbl> <dbl>
1 0.000000 18.774037
2 41.968310 30.276509
3 32.804031 0.000000
4 8.669436 3.092062
5 32.937377 19.894776
I want to shorten the variable names by using only the first letter of each word, as shown below. I can do that with, for example, dplyr::rename. However, I have 97 variables and I want to do the same for 20 data.frames that have different variable names. I wonder if there is a faster way to do that.
library(dplyr)
df <- df %>% rename("r_scct" = "rock_sandstone conglomerate coquina tephra",
                    "r_smbcl" = "rock_sandstone mudstone basalt chert limestone")
> df
# A tibble: 5 x 2
r_scct r_smbcl
<dbl> <dbl>
1 0.000000 18.774037
2 41.968310 30.276509
3 32.804031 0.000000
4 8.669436 3.092062
5 32.937377 19.894776
DATA
> dput(df)
structure(list(`rock_sandstone conglomerate coquina tephra` = c(0,
41.9683095321332, 32.8040311360418, 8.66943642122745, 32.9373770476129
), `rock_sandstone mudstone basalt chert limestone` = c(18.7740373237074,
30.2765089609693, 0, 3.09206176664796, 19.8947759845006)), row.names = c(NA,
-5L), class = c("tbl_df", "tbl", "data.frame"), .Names = c("rock_sandstone conglomerate coquina tephra",
"rock_sandstone mudstone basalt chert limestone"))
A bit ugly, but abbreviate and some regex replacement will get you there:
names(df) <- sub("^(.)", "\\1_", abbreviate(gsub("_", " ", names(df))))
df
## A tibble: 5 × 2
# r_scct r_smbcl
# <dbl> <dbl>
#1 0.000000 18.774037
#2 41.968310 30.276509
#3 32.804031 0.000000
#4 8.669436 3.092062
#5 32.937377 19.894776
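For the 20 data frames mentioned in the question, one option is to wrap the one-liner in a small function and apply it over a list. This is only a sketch: `shorten_names`, the list `dfs`, and the `soil_clay loam` column are hypothetical names made up for illustration. Passing `minlength = 1` is also my own addition, on the assumption that one letter per word is wanted even for names with fewer than four words (`abbreviate` defaults to `minlength = 4`).

```r
# Hypothetical helper: turn "rock_sandstone mudstone" into "r_sm".
shorten_names <- function(x) {
  # Treat the underscore as a word break so abbreviate() sees every word.
  words <- gsub("_", " ", x)
  # abbreviate() always keeps the first character of each word, so
  # minlength = 1 leaves just the initials.
  short <- unname(abbreviate(words, minlength = 1))
  # Put the underscore back after the category initial.
  sub("^(.)", "\\1_", short)
}

# Assumed setup: the data frames are collected in a (named) list.
dfs <- list(
  a = data.frame(`rock_sandstone conglomerate coquina tephra` = 1:2,
                 `rock_sandstone mudstone basalt chert limestone` = 3:4,
                 check.names = FALSE),
  b = data.frame(`soil_clay loam` = 1:2, check.names = FALSE)
)

# Rename the columns of every data frame in the list.
dfs <- lapply(dfs, function(d) {
  names(d) <- shorten_names(names(d))
  d
})

lapply(dfs, names)
```

One caveat with this approach: only the first character is treated as the category, so a two-word category such as "land use" would have its second initial merged into the level part (for example `land use_forest crop` would become `l_ufc`).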
I am not familiar with abbreviate, but the same can be achieved directly with a few regexp substitutions:
names( df ) <- gsub( ' ', "", gsub( "([a-z])([a-z]+)", "\\1", names( df ) ) )
Using magrittr allows for a cleaner syntax:
library( magrittr )
names( df ) %<>%
  gsub( "([a-z])([a-z]+)", "\\1", . ) %>%
  gsub( " ", "", . )
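Reducing every word to its initial can map two different long names to the same short name, which matters with 97 variables. The sketch below applies the same two substitutions to a character vector and then checks for collisions. The third name is a made-up example chosen to collide with the second, and using `make.unique` to resolve the clash is just one possible choice, not part of the answer above.

```r
old <- c("rock_sandstone conglomerate coquina tephra",
         "rock_sandstone mudstone basalt chert limestone",
         "rock_shale marl breccia claystone lignite")  # hypothetical name

# Step 1: keep only the first letter of every run of lowercase letters.
initials <- gsub("([a-z])([a-z]+)", "\\1", old)
# Step 2: drop the spaces between the initials.
short <- gsub(" ", "", initials)

# anyDuplicated() returns 0 when all names are distinct, otherwise the
# index of the first duplicate.
dup_index <- anyDuplicated(short)

# One way to keep the names distinct: append a numeric suffix to repeats.
short_unique <- make.unique(short, sep = "_")
short_unique
```

Keeping `old` and `short_unique` side by side (for example with `setNames(short_unique, old)`) gives a lookup table for tracing a short name back to its original.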