UTF-8 encoding on Windows and Mac garbles characters

Question

I want to use my Windows machine (Windows 7, 64-bit) to obtain data from the iTunes API and process this data on my Mac (64-bit, El Capitan). I am using the RJSONIO package to extract the names of the applications, which come from different countries and are in different languages. I have attached a minimal example with only a few applications. My preferred encoding is UTF-8.

library(RJSONIO)

# Fetch each lookup URL and extract the track, artist, and seller names
getall <- function(ID) {
  lapply(X = ID, function(u) {
    dat <- fromJSON(u, encoding = "UTF-8")
    Name <- try(dat$results[[1]]$trackName)
    Artistname <- try(dat$results[[1]]$artistName)
    Seller <- try(dat$results[[1]]$sellerName)
    list(Name, Artistname, Seller)
  })
}

apps1 <- c("https://itunes.apple.com/lookup?id=335549244", "https://itunes.apple.com/lookup?id=362032276", "https://itunes.apple.com/lookup?id=353410020", "https://itunes.apple.com/lookup?id=350146139", "https://itunes.apple.com/lookup?id=358942449", "https://itunes.apple.com/lookup?id=359871187")
# One row per app, three columns; byrow is an argument of matrix(), not data.frame()
system.time(itunesNew <- data.frame(matrix(unlist(getall(ID = apps1), use.names = FALSE), nrow = length(apps1), ncol = 3, byrow = TRUE), stringsAsFactors = FALSE))
colnames(itunesNew) <- c("Name", "Artistname", "Seller")
itunesnew2 <- cbind(apps1, itunesNew)

I am using R with RStudio (both the most recent versions) and have set the default encoding to UTF-8 in the global options. I was not able to set my locale to UTF-8 using

Sys.setlocale("LC_MESSAGES", 'en_GB.UTF-8')

or other variants in R. I also tried downloading the data as "latin1": it then looks fine on the PC, but is garbled on the Mac (with the encoding set to latin1 in RStudio).

Questions:

Is there a way to work with the data on both machines using UTF-8?
Are there other options for working on both machines?
More generally: is UTF-8 the encoding one should prefer for data like this?

Answer 1

I don't have my Windows VM handy, but try this on both of your systems to see if it helps. It uses jsonlite and dplyr (I ran it on OS X):

library(jsonlite)
library(dplyr)

"%||%" <- function(a, b) { if (!is.null(a)) a else b }

apps <- c("https://itunes.apple.com/lookup?id=335549244", 
          "https://itunes.apple.com/lookup?id=362032276", 
          "https://itunes.apple.com/lookup?id=353410020", 
          "https://itunes.apple.com/lookup?id=350146139",
          "https://itunes.apple.com/lookup?id=358942449", 
          "https://itunes.apple.com/lookup?id=359871187")

# Query each URL and keep one row per app; fields missing from the response become NA
bind_rows(lapply(apps, function(x) {
  res <- jsonlite::fromJSON(x, flatten=TRUE)$results
  data_frame(name=res$trackName %||% NA,
             artist_name=res$artistName %||% NA,
             seller=res$sellerName %||% NA)
})) -> dat

glimpse(dat)

## Observations: 6
## Variables: 3
## $ name        (chr) "A+ the Waverley Novels Collection (15Books)", "A+ 中國養生寶典[卷一]", "...
## $ artist_name (chr) "rice mi", "CHEUNG PUI MAN", "CHEUNG PUI MAN", "CHEUNG PUI MAN", ...
## $ seller      (chr) "rice mi", "CHEUNG PUI MAN", "CHEUNG PUI MAN", "CHEUNG PUI MAN", ...
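
For moving the result between the two machines, here is a minimal sketch of one possible approach using only base R. It is not part of the code I ran above: the small `dat` data frame and the `path` are hypothetical stand-ins. The idea is to make sure the strings are UTF-8 with `enc2utf8()` and to transfer them as an `.rds` file, since `saveRDS()` stores the encoding marks along with the strings.

```r
# Hypothetical stand-in for the data frame built above; the \u escapes keep
# this script independent of the encoding of the script file itself.
dat <- data.frame(
  name   = c("A+ the Waverley Novels Collection (15Books)",
             "A+ \u4e2d\u570b\u990a\u751f\u5bf6\u5178"),
  seller = c("rice mi", "CHEUNG PUI MAN"),
  stringsAsFactors = FALSE
)

# Convert every character column to UTF-8 (non-ASCII strings get marked as UTF-8).
is_chr <- vapply(dat, is.character, logical(1))
dat[is_chr] <- lapply(dat[is_chr], enc2utf8)

# saveRDS() keeps the strings together with their encoding marks, so reading
# the file does not depend on the locale of the other machine.
path <- tempfile(fileext = ".rds")  # hypothetical location; use a shared folder in practice
saveRDS(dat, path)

# On the other machine:
dat2 <- readRDS(path)
Encoding(dat2$name)   # declared encoding of each string
identical(dat, dat2)  # check that the round trip preserved the data
```

If the data has to travel as text instead (CSV, JSON), the same principle should apply: state the encoding explicitly when writing and again when reading, rather than relying on each machine's default locale.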
