Create a dictionary and use it to replace Latin words in R
I have a dataset with Latin words:
text<-c("TESS",
"MAG")
I want to transliterate it from Latin to Cyrillic:
library(stringi)
d=stri_trans_general(mydat$text, "latin-cyrillic")
But I want to create the transliteration dictionary manually. For example:
dictionary<-c("Tess"="ТЕСС",
"MAG"="МАГ"
# .......
# ......
)
Once the dictionary is created, all Latin words in mydat$text must be replaced by the Cyrillic words I set, something like this:
d=dictionary(mydat$text)
How can I perform such a replacement?
text<-c("TESS",
"MAG")
dict=path.csv
It contains:
dict=
structure(list(old = structure(c(2L, 1L), .Label = c("mag", "tess"
), class = "factor"), new = structure(c(2L, 1L), .Label = c("маг",
"тесс"), class = "factor")), .Names = c("old", "new"), class = "data.frame", row.names = c(NA,
-2L))
# output
text<-c("ТЕСС",
"МАГ")
That's all.
There you go!
dict <- structure(list(
old = structure(c(2L, 1L), .Label = c("mag", "tess"),class = "factor"),
new = structure(c(2L, 1L), .Label = c("маг", "тесс"), class = "factor")),
.Names = c("old", "new"), class = "data.frame", row.names = c(NA, -2L))
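input <- c("TESS", "MAG")
# Convert the factor columns to character, then look up each lowercased
# input word in `old` and return the entry of `new` at the same position.
output <- with(lapply(dict, as.character), new[match(tolower(input), old)])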
output
# [1] "тесс" "маг"
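If the dictionary is kept as a named character vector, as in the question, the same lookup can be done by name. The following is only a minimal sketch under a few assumptions: the keys are stored in lowercase, translate() is a hypothetical helper name (not a function from any package), and words without a dictionary entry should be left unchanged rather than turned into NA.

```r
# Hypothetical dictionary as a named character vector: names are the
# lowercase Latin keys, values are the Cyrillic replacements.
dictionary <- c(tess = "тесс", mag = "маг")

# translate() is an illustrative helper, not part of any package.
translate <- function(x, dict) {
  hit <- unname(dict[tolower(x)])  # look up by name; NA when no entry exists
  ifelse(is.na(hit), x, hit)       # keep the original word when unmatched
}

input <- c("TESS", "MAG", "FOO")   # "FOO" is made up to show the fallback
output <- translate(input, dictionary)
output
# [1] "тесс" "маг"  "FOO"
```

If uppercase output is needed, as in the expected result in the question, wrapping the result in toupper() should work for Cyrillic in a UTF-8 locale, but that behaviour depends on the locale.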