
Can't write some unicode characters to file

Let's consider the following code:

> cat('\u2077\u2078\u2079 \u2087\u2088\u2089')
⁷⁸⁹ ₇₈₉
> out <- file("out.txt", "w", encoding = 'utf-8')
> cat('\u2077\u2078\u2079 \u2087\u2088\u2089', file=out)
> close(out)

The content of out.txt is:

78<U+2079> 789

The superscript and subscript forms are lost, and for the superscript 9 the code point is printed instead of the character.

What's happening here? How can I get the characters into the file in the same form as they are printed in the RStudio console?

Versions: RStudio 1.1.436 / R 3.5.2 / Windows 10
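
One way to narrow this down is to check whether the string is intact in memory and only changes on its way to the file. The following is a minimal diagnostic sketch, not a fix; the variable name `x` is only illustrative, and what `enc2native()` shows depends on the locale of the session.

```r
x <- '\u2077\u2078\u2079 \u2087\u2088\u2089'

# How R has marked the string; \u escapes for non-ASCII characters yield "UTF-8"
Encoding(x)

# Raw bytes of the first character; U+2077 is E2 81 B7 in UTF-8
charToRaw(substr(x, 1, 1))

# The string after conversion to the session's native encoding.
# In a non-UTF-8 locale, characters without a native equivalent may be
# substituted or escaped, much like the content seen in out.txt
enc2native(x)

# The locale that determines the native encoding
Sys.getlocale('LC_CTYPE')
```

If the first two results look right but `enc2native(x)` already shows the damage, the characters are probably being lost in a conversion to the native encoding before they reach the file.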

Aargh, Windows and UTF-8!

I've been puzzling over this as well, and this works for me:

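# Keep the default connection encoding native so that file() does not re-encode on write
options(encoding = 'native.enc')
# Pass the file name directly: writeLines() opens and closes its own connection.
# useBytes = TRUE writes the UTF-8 bytes of the string as they are
writeLines('\u2077\u2078\u2079 \u2087\u2088\u2089', 'out.txt', useBytes = TRUE)
# Declare the encoding when reading back so the strings are marked as UTF-8
readback <- readLines('out.txt', encoding = 'UTF-8')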

My setup is a bit older (the setup I use most is OSX): RStudio 0.99.903 / R 3.3.1 / Windows 7

The strangest thing I've encountered is that it stops working if you set options(encoding='UTF-8').
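
A possible explanation, offered as an assumption rather than something verified on the setups above: file() takes its default encoding from the `encoding` option, so changing that option changes how every connection opened by file name re-encodes its output. The sketch below only shows where the default comes from and how to set the option temporarily; `old` is an illustrative name.

```r
# file() takes its default 'encoding' argument from this option
formals(file)$encoding

# Set the option temporarily, inspect it, then restore the previous value
old <- options(encoding = 'native.enc')
getOption('encoding')
options(old)
```

Restoring the option afterwards keeps the change from affecting other code in the same session.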

And finally, I noticed that all mentions of UTF-8 are in uppercase, whereas you used lowercase. I'm not sure if that makes a difference.
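
If the text-mode settings stay fragile, another option is to sidestep re-encoding by writing the UTF-8 bytes through a binary connection. This is a minimal sketch of that alternative, not the approach tested above; `path` is a hypothetical temporary file used only for the demonstration.

```r
# Text to write; \u escapes give a string that R marks as UTF-8
txt <- '\u2077\u2078\u2079 \u2087\u2088\u2089'

# Hypothetical scratch file for the demonstration
path <- tempfile(fileext = '.txt')

# Open in binary mode so no encoding or newline translation is applied
con <- file(path, open = 'wb')
writeBin(charToRaw(enc2utf8(txt)), con)
close(con)

# Read the raw bytes back and compare them with the UTF-8 bytes of the original
bytes <- readBin(path, what = 'raw', n = file.size(path))
identical(bytes, charToRaw(enc2utf8(txt)))

# Read it as text; declaring the encoding marks the result as UTF-8.
# warn = FALSE because the file has no trailing newline
readback <- readLines(path, encoding = 'UTF-8', warn = FALSE)
identical(readback, txt)
```

Because the bytes are written as they are, the result should not depend on the session locale or on the `encoding` option, at the cost of handling line endings yourself.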


 