
R: how to read or replace Spanish special characters

I want to use data from the data set here. It is a Spanish-language data set, from Peru I think. It can be downloaded in several formats, but they all seem to have the same problem. Here's an example: maÌ_z , which should be maíz . My first thought was that there was a character encoding problem, so I tried several encodings that are commonly used for Spanish-language documents (e.g., UTF-8, WINDOWS-1252, ISO-8859-1) using the RStudio "Reopen with Encoding" option. The character representation changes for some of them, but never to the correct í . Some other examples: Cimarr?_n , c??scara , m??shka . I think I could do a search and replace, but I would prefer to find an encoding fix.

Have you tried passing the encoding argument directly to the read function (for example read.csv())? Here is an example:

dt <- read.csv("dt", header = TRUE, sep = ",", dec = ".",
               comment.char = "", strip.white = TRUE,
               stringsAsFactors = TRUE, encoding = "UTF-8")

When I use French data, I have to do it this way.

It is possible the original file was not encoded in UTF-8, in which case you may have to convert it to UTF-8 before (or while) reading it.
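If the file turns out not to be UTF-8, something like the following might help. It is only a sketch: the real encoding of the downloaded file is not known, so "latin1" here is an assumption, and the temporary file is a made-up stand-in for the actual data. Note that in read.csv() the encoding argument only marks strings as being in a given encoding, while fileEncoding declares the file's encoding so the text is re-encoded on input.

```r
# Build a tiny Latin-1 CSV in a temp file as a stand-in for the real data.
# Byte 0xED is "í" in Latin-1 (ISO-8859-1); the real file's encoding is an assumption.
tmp <- tempfile(fileext = ".csv")
writeBin(c(charToRaw("crop\nma"), as.raw(0xED), charToRaw("z\n")), tmp)

# Option 1: declare the file's encoding so read.csv() re-encodes while reading.
dt <- read.csv(tmp, fileEncoding = "latin1", stringsAsFactors = FALSE)

# Option 2: read the lines as-is, then convert them explicitly with iconv().
lines_utf8 <- iconv(readLines(tmp, warn = FALSE), from = "latin1", to = "UTF-8")

# Inspect the code points of the converted word; 237 (U+00ED) is "í".
print(utf8ToInt(lines_utf8[2]))
```

If neither "latin1" nor "UTF-8" gives the right characters, trying other names from iconvlist() in the from argument is one way to narrow down the source encoding.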

