
How to read Chinese in RStudio on Linux

I ran into a problem reading a file that contains Chinese text in RStudio on a Linux system.

The call and the resulting error are shown below.

dt <- read.csv(file = "/home/..../aa-0912.csv", header = T , sep=",")

Error in make.names(col.names, unique = TRUE) : 
  invalid multibyte string at '<be><ba><b5><c3><c8><cb>'

This CSV file was written by RStudio on a Windows system without specifying an encoding, as below:

write.csv(file = "/home/.../aa-0912.csv", data)

I can read the file correctly on Windows, but after copying it to my Linux system, read.csv no longer works.

The locale on Linux is:

Sys.getlocale()

[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C"

The locale on Windows is:

LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

I tried reading the data with encoding="utf-8", but I got a similar error message.

Any help?

I'm not sure that this is the answer to your question.

I'll try to be as general as possible so that people having trouble in any language might have a solution:

First, run the following in a terminal to list all the locales available on your system:

locale -a

Once you have found the right locale, set it in RStudio:

Sys.setlocale("LC_ALL","fr_FR.utf8") 
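Sys.setlocale() does not stop with an error when the requested locale is missing: it returns an empty string and emits a warning, so it can help to check the result. A minimal sketch, assuming zh_CN.UTF-8 as a purely hypothetical locale name (use whichever name locale -a lists on your machine):

```r
# "zh_CN.UTF-8" is an assumed example name, not taken from the question.
wanted <- "zh_CN.UTF-8"

# Only LC_CTYPE (character handling) is changed here; the warning is
# suppressed because the return value is checked explicitly instead.
res <- suppressWarnings(Sys.setlocale("LC_CTYPE", wanted))

if (identical(res, "")) {
  # The request failed, so the previous setting is still in effect.
  message("Locale '", wanted, "' is not available; LC_CTYPE is still ",
          Sys.getlocale("LC_CTYPE"))
} else {
  message("LC_CTYPE is now ", res)
}
```

If the message reports that the locale is not available, the name probably does not match any entry printed by locale -a.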

Sorry, I don't seem to have any Chinese locale on my system. Other people have had the same issue: here and here

Also have a look at ?Sys.setlocale in R.
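Before changing the locale, it may be worth checking which encoding the file actually uses. The sketch below takes the six bytes printed in the error message and tests whether they are valid UTF-8 and whether they decode under GBK. GBK is only a guess here (it is a common encoding for Chinese text on Windows); the question does not confirm it, and the variable names are made up for illustration.

```r
# Bytes copied from the error message: <be><ba><b5><c3><c8><cb>
header_bytes <- as.raw(c(0xbe, 0xba, 0xb5, 0xc3, 0xc8, 0xcb))
header_raw <- rawToChar(header_bytes)

# FALSE here: the byte sequence is not valid UTF-8, which is why a
# UTF-8 locale rejects it as an invalid multibyte string.
is_utf8 <- validUTF8(header_raw)

# Assumption: the source encoding is GBK. iconv() returns NA when the
# input cannot be converted from the stated encoding.
header_fixed <- iconv(header_raw, from = "GBK", to = "UTF-8")
decodes_as_gbk <- !is.na(header_fixed)

message("Valid UTF-8: ", is_utf8, "; decodes as GBK: ", decodes_as_gbk)
```

If the bytes do decode, passing fileEncoding = "GBK" to read.csv (in addition to file, header and sep) might be enough on its own, because fileEncoding re-encodes the input while it is read. The encoding argument tried in the question only marks strings as Latin-1 or UTF-8 and does not convert them, which would explain why it made no difference.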

