
Removing accents in column names in R

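I have tried nearly every solution on this site, so I am starting to think the problem may lie in the Excel files. I have multiple xlsx files whose sheets were combined into a single data frame (using map_df). Unfortunately, the names are in Spanish, and the accented characters cause problems in R further along in the code. The accents appear only in the column names. Any hints or suggestions on how to deal with accented names? I am not sure whether the data coming from xlsx files is the reason the code I tried did not work. Thank you.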

Data sample, as requested:

structure(list(file = c("location1/location2/namelocationfile1.xlsx", 
"location1/location2/namelocationfile2.xlsx", 
"location1/location2/namelocationfile3.xlsx", 
"location1/location2/namelocationfile4.xlsx", 
"location1/location2/namelocationfile5.xlsx", 
"location1/location2/namelocationfile6.xlsx"
), sheet = c("TOTAL-2015 ", "TOTAL-2015 ", "TOTAL-2015 ", "TOTAL-2015 ", 
"TOTAL-2015 ", "TOTAL-2015 "), age = c("Total", "0-4", "0", 
"1", "2", "3"), total = c("355461", "35173", "7091", "7042", 
"7027", "7008"), plán = c("126131", "11698", "2407", "2318", 
"2349", "2282"), pláns = c("8456", "726", "162", "135", "133", 
"138"), place = c("35112", "2969", "599", "607", "555", 
"597"), concepción = c("12912", "1283", "281", "263", 
"244", "253"), refugio = c("10959", "903", "174", "174", "206", 
"184"), lugar = c("20733", "2229", "431", "454", "409", "486"
), san_marco = c("31082", "3271", "624", "658", "670", "656"), 
    menéndez = c("47495", "5070", "990", "1023", 
    "1008", "1020"), san = c("10244", "955", "193", "203", 
    "189", "194"), san_pedro = c("8374", "915", "183", 
    "181", "205", "175"), buenosaires = c("33242", "4244", "862", 
    "857", "894", "836"), turín = c("10721", "910", "185", "169", 
    "165", "187")), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

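Here is one possible solution with iconv. The result of the transliteration depends on the platform: on some systems the accent is left behind as a separate ASCII mark, which gsub then removes.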

string <- c("à", "è", "ì"," ò"," ù", "À", "È", "Ì", "Ò", "Ù")

# [1] "à"  "è"  "ì"  " ò" " ù" "À"  "È"  "Ì"  "Ò"  "Ù" 


gsub("`", "", iconv(string, from = "UTF-8", to = "ASCII//TRANSLIT"))

# [1] "a"  "e"  "i"  " o" " u" "A"  "E"  "I"  "O"  "U" 

Another option, with stringi, does not need gsub:

library(stringi)
stri_trans_general(str = string, id = "Latin-ASCII")

# [1] "a"  "e"  "i"  " o" " u" "A"  "E"  "I"  "O"  "U" 

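Update

To apply the function to the column names with rename_with, we need to use .x inside iconv. Also, for gsub, the pattern here is ' rather than `, since the acute accents in these column names may be transliterated as an apostrophe, unlike the grave accents in the example above.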

library(tidyverse)

df_new <- df %>% 
    rename_with(., ~ gsub("'", "", iconv(.x, from = "UTF-8", to='ASCII//TRANSLIT')))

# Or we can use `stringr` instead of `gsub`:
# df %>% 
#    rename_with(., ~ str_replace_all(iconv(.x, to='ASCII//TRANSLIT'), "'", ""))

colnames(df_new)
# [1] "file"        "sheet"       "age"         "total"       "plan"        "plans"       "place"       "concepcion"  "refugio"    
# [10] "lugar"       "san_marco"   "menendez"    "san"         "san_pedro"   "buenosaires" "turin"
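
Because the mark that iconv leaves behind can differ by accent type and by platform, a single gsub pattern may not cover every case. The sketch below is only an illustration under stated assumptions: the vector `translit` is written out by hand to stand in for transliteration output (it is not produced by iconv here), and the character class is an assumed set of marks, so adjust it to what your system actually returns.

```r
# Hand-written stand-ins for what a transliteration step might return on a
# platform that keeps the accent as a separate ASCII mark (assumed values,
# not generated by iconv in this example).
translit <- c("pl'an", "concepci'on", "men'endez", "tur'in", "`a", "~n")

# Strip any of the marks in one pass: backtick, apostrophe, caret, tilde
# and double quote. The caret is literal because it is not the first
# character inside the brackets.
cleaned <- gsub("[`'^~\"]", "", translit)

print(cleaned)
```

Note that this would also drop apostrophes that legitimately belong to a name, so it is best suited to column names where such characters are not expected.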

Base R option:

colnames(df) <- gsub("'", "", iconv(colnames(df), from = "UTF-8", to='ASCII//TRANSLIT'))

Or, with stringi:

colnames(df) <- stri_trans_general(str = colnames(df), id = "Latin-ASCII")
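
As a rough end-to-end sketch of the stringi approach, the example below builds a small toy data frame (a hypothetical stand-in for the combined sheets, using a few of the column names from the question) and renames its columns. The duplicate check is an added precaution that is not part of the original answer: two different accented names could in principle collapse to the same ASCII name.

```r
library(stringi)

# Toy data frame standing in for the combined sheets (values taken from the
# sample in the question). Names are assigned with \u escapes so the script
# parses the same way regardless of locale.
df <- data.frame(
  a = c("Total", "0-4"),
  b = c("126131", "11698"),
  c = c("12912", "1283"),
  d = c("47495", "5070"),
  stringsAsFactors = FALSE
)
names(df) <- c("age", "pl\u00e1n", "concepci\u00f3n", "men\u00e9ndez")

# Transliterate the accented Latin letters to their plain ASCII counterparts.
new_names <- stri_trans_general(names(df), id = "Latin-ASCII")

# Stop early if removing accents would make two column names identical.
if (anyDuplicated(new_names) > 0) {
  stop("Removing accents produced duplicate column names")
}

names(df) <- new_names
print(names(df))
```

Only the names change; the column contents are left untouched.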

