ValueError when reading a SAS file with pandas
pandas.read_sas() prints traceback messages that I cannot suppress. The problem is that it prints a message for EACH row it reads, so when I try to read the whole file it freezes from printing so much output.
I tried the following, taken from other Stack Overflow answers:
import warnings
warnings.simplefilter(action='ignore')
And
warnings.filterwarnings('ignore')
And
from IPython.display import HTML
HTML('''<script>
code_show_err=false;
function code_toggle_err() {
if (code_show_err){
$('div.output_stderr').hide();
} else {
$('div.output_stderr').show();
}
code_show_err = !code_show_err
}
$( document ).ready(code_toggle_err);
</script>
To toggle on/off output_stderr, click <a
href="javascript:code_toggle_err()">here</a>.''')
But nothing works.
The message it prints is:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
pandas\io\sas\sas.pyx in pandas.io.sas._sas.rle_decompress()

ValueError: Unexpected non-zero end_of_first_byte
Exception ignored in: 'pandas.io.sas._sas.Parser.process_byte_array_with_data'
Traceback (most recent call last):
  File "pandas\io\sas\sas.pyx", line 29, in pandas.io.sas._sas.rle_decompress
ValueError: Unexpected non-zero end_of_first_byte
As the traceback shows, the error is caused by a bug in the pandas implementation of RLE decompression, which is used when the SAS dataset was exported with CHAR (RLE) compression.
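On why the warnings filters have no effect: the lines starting with "Exception ignored in:" are not Python warnings. They are what the interpreter writes when an exception cannot propagate to the caller, and on Python 3.8 or later that report goes through sys.unraisablehook. Below is a minimal sketch, assuming Python 3.8+ and assuming the messages really do arrive through this path. The helper name collect_unraisable is made up for illustration. It only hides the reports and does not repair the decompression, so the affected rows may still be read incorrectly.

```python
import sys
from contextlib import contextmanager


@contextmanager
def collect_unraisable():
    """Temporarily route 'Exception ignored in' reports into a list.

    Hypothetical helper, not part of pandas. Requires Python 3.8+.
    """
    collected = []
    previous_hook = sys.unraisablehook

    def hook(unraisable):
        # Store only a short text summary, not the exception object itself,
        # to avoid keeping references to it alive.
        collected.append(f"{unraisable.exc_type.__name__}: {unraisable.exc_value}")

    sys.unraisablehook = hook
    try:
        yield collected
    finally:
        # Always restore the original hook, even if the body raised.
        sys.unraisablehook = previous_hook


# Typical use (needs pandas and a real file):
#   import pandas as pd
#   with collect_unraisable() as errors:
#       df = pd.read_sas("./path/to/file.sas7bdat")
#   print(len(errors), "ignored exceptions")
```

Collecting the messages instead of discarding them keeps a count of how many rows hit the bug, which is worth checking before trusting the resulting DataFrame.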
Note the pandas issue created for this topic: https://github.com/pandas-dev/pandas/issues/31243
The fix that pandas implemented for this bug in read_sas is contained in the following pull request, which is part of the version 1.5 milestone and had not yet been released at the time of answering: https://github.com/pandas-dev/pandas/pull/47113
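To see which side of the fix an installation is on, the version string can be compared against 1.5. This is a small sketch, assuming the 1.5 milestone mentioned above is where the fix landed. has_read_sas_rle_fix is a hypothetical helper, not part of pandas.

```python
def has_read_sas_rle_fix(version: str) -> bool:
    """Return True if a pandas version string is 1.5 or newer.

    Assumes the usual "major.minor.patch" form (for example "1.4.4" or
    "2.1.0rc0"); only the first two components are compared.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (1, 5)


# Typical use (needs pandas installed):
#   import pandas as pd
#   print(has_read_sas_rle_fix(pd.__version__))
```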
To answer your question, you have two options:
1. Wait until pandas releases version 1.5, update to that version, and read_sas should then work as expected. You've already been waiting a while since you asked, so I suspect this will be fine.
2. Use the sas7bdat library instead (https://pypi.org/project/sas7bdat/), and then convert to a pandas DataFrame:
from sas7bdat import SAS7BDAT
df = SAS7BDAT("./path/to/file.sas7bdat").to_data_frame()
The sas7bdat approach worked for me after I hit the exact same error as you did.