
BlueSky Statistics - Character Encoding Problem

I am loading a data set whose characters were encoded in ISO 8859-9 ("Latin 5") on Windows 10 (Microsoft assigns code page 28599, also known as Windows-28599, to ISO-8859-9). The data set is originally in Excel.

Whenever I run an analysis, or any other operation, on a variable whose name contains a character specific to this code page (ISO 8859-9), I get an error like:

Error: undefined columns selected
BSkyFreqResults <- BSkyFrequency(vars = c("MesleÄŸi"), data = Turnudep_raw_data_5)
Error: object 'BSkyFreqResults' not found
BSkyFormat(BSkyFreqResults)

The characters ÄŸ within "MesleÄŸi" are originally a single Turkish character, ğ (g with a breve, i.e. an inverted hat).
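The garbling is consistent with the UTF-8 bytes of ğ being read under a single-byte Windows code page. Here is a minimal Python sketch of that effect, assuming cp1252 as the misapplied code page (cp1254 maps these two bytes to the same characters). The function names are hypothetical, and nothing here is BlueSky's actual code:

```python
def garble(name: str, wrong_codec: str = "cp1252") -> str:
    """Encode as UTF-8, then decode the same bytes with a single-byte codec."""
    utf8_bytes = name.encode("utf-8")      # ğ becomes the two bytes C4 9F
    return utf8_bytes.decode(wrong_codec)  # each byte turns into one character


def repair(garbled: str, wrong_codec: str = "cp1252") -> str:
    """Undo garble() by reversing the two steps."""
    return garbled.encode(wrong_codec).decode("utf-8")
```

Calling `garble("Mesleği")` yields the same garbled name that appears in the error message above, and `repair` turns it back into the original.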

Variable names that contain only letters from the US code page work normally in BlueSky operations.
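Since ASCII-only names work, one possible stopgap is to rename the columns before loading the data. The Python sketch below is a hypothetical helper, not a BlueSky feature; it only maps the Turkish-specific letters to ASCII look-alikes:

```python
# Hypothetical workaround: replace Turkish-specific letters with ASCII
# look-alikes so that column names contain only US-ASCII characters.
TURKISH_TO_ASCII = str.maketrans({
    "ç": "c", "Ç": "C",
    "ğ": "g", "Ğ": "G",
    "ı": "i", "İ": "I",
    "ö": "o", "Ö": "O",
    "ş": "s", "Ş": "S",
    "ü": "u", "Ü": "U",
})


def ascii_name(name: str) -> str:
    """Return the name with Turkish letters replaced by ASCII ones."""
    return name.translate(TURKISH_TO_ASCII)


def ascii_names(names):
    """Apply ascii_name to every column name, keeping the order."""
    return [ascii_name(n) for n in names]
```

Two different names can collapse into one (for example, ı and i both become i), so the renamed list should be checked for duplicates before use.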

If I use Save As in Excel with the web option set to UTF-8 to convert the data to UTF-8, that does not work either. Exporting to a CSV file does not work either, whether the file is saved as is or as UTF-8.
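To rule out the export step, it can help to check which encoding the saved file actually uses. This is a rough Python sketch with hypothetical function names; the fallback to cp1254 is an assumption based on the Turkish locale, not something the check can confirm:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def guess_encoding(data: bytes) -> str:
    """Very rough guess between UTF-8 and a Turkish single-byte code page."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"  # UTF-8 with a byte order mark
    if is_valid_utf8(data):
        return "utf-8"
    return "cp1254"  # assumed fallback, cannot be verified from the bytes alone
```

The functions take the raw bytes of the file, read in binary mode. If the result is `utf-8` or `utf-8-sig` and BlueSky still garbles the names, the file itself is probably not the cause.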

How can I load this data into BlueSky so that it works?

The same data set works in RStudio:

> Sys.getlocale('LC_CTYPE')
[1] "Turkish_Turkey.1254"

It also works in SPSS, with the language set to Unicode (screenshot: language settings in SPSS).

It also works in Jamovi.

I also get an error when I start BlueSky, which may be relevant to this problem:

Python-CFFI error

From cffi callback <function _consolewrite_ex at 0x000002A36B441F78>:
Traceback (most recent call last):
  File "rpy2\rinterface_lib\callbacks.py", line 132, in _consolewrite_ex
  File "rpy2\rinterface_lib\conversion.py", line 133, in _cchar_to_str_with_maxlen
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 15: invalid start byte
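Byte 0xFC is ü in ISO 8859-9 and Windows-1254, but it can never start a UTF-8 sequence. A plausible reading of the traceback is therefore that console text written in the Turkish code page is being decoded as UTF-8. The Python sketch below reproduces only the decoding behavior; `try_decode` is a hypothetical helper, not part of rpy2:

```python
def try_decode(data: bytes, encoding: str) -> str:
    """Decode bytes, returning the error text instead of raising."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"{type(exc).__name__}: {exc}"


sample = b"\xfc"  # the single byte named in the traceback

as_utf8 = try_decode(sample, "utf-8")      # fails: invalid start byte
as_turkish = try_decode(sample, "cp1254")  # succeeds: the letter ü
```

The same byte decodes without error under `cp1254` or `iso8859_9`, which fits the observation that the message depends on the Windows region setting.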

Since then I have re-downloaded and re-installed BlueSky, but I still get this Python-CFFI error every time I start the software.

I want to work with BlueSky and will appreciate any help in resolving this problem. Thanks in advance.

Here is a link for reproducing the problem.

The zip file contains a two-case data source in both Excel and BlueSky formats, a BlueSky Markdown file showing how the error is produced, and an RMarkdown file for redundancy (probably useless).

UPDATE: The Python error (Python-CFFI error) appears to be related to the Region settings in Windows.

If the region is USA ( Turnudep_reprex_Windows_Region_USA-Settings.jpg ), the Python error does NOT appear. If the region is Turkey ( Turnudep_reprex_Windows_Region_Turkey-Settings.jpg ), the Python error DOES appear.

Unfortunately, setting the region and language to USA eliminates the Python error message but not the other problem. All operations on Turkish variable names still end in an error.

This may be a problem that only the BlueSky developers can solve...

Any help or suggestion will be greatly appreciated.

For version 10, according to chapter 15.1.3 of the user manual, you can adjust the encoding setting. (The answer has been edited for clarity.)
