
Pandas read_table errors

I'm new to Python and trying to learn Pandas, but I ran into a problem early on. I'm trying to read a log file and save it as a dataframe. It's a space-delimited text file with a single header row containing the column names. Here's the sample code I'm running just to test the read function.

import pandas as pd
data = pd.read_table('C:\Aerosonde Test Logs\MH_Data\TEC_20170105-083220\222_1_4435_.log',
               delim_whitespace='True', nrows=20)
print(data)

Below is a snippet of the log file.

<Clock>[ms] <Year>  <Month> <Day>   <Hours> <Minutes>   <Seconds>   <Lat>[rad]  <Lon>[rad]  <Height>[m]
48161   2017    1   5   4   30  13.366  5.02E-06    8.05E-07    267.37
49161   2017    1   5   4   30  14.366  5.01E-06    7.95E-07    266.61
50161   2017    1   5   4   30  15.366  5.02E-06    7.95E-07    266.24

I keep getting errors, though. When I try to read the entire log file I get this error:

"UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character"

I tried opening the log file in Excel and then saving it again as a tab-delimited file. When I tried to open that file using the same code I got a different error:

"TypeError: an integer is required"

I tried skipping the header row, thinking the extra characters there were the problem, but that didn't fix it either. So now I'm at a loss and hoping for some advice!

EDIT: Thanks to Matteo, I was able to fix the UnicodeEncodeError by using '\\' as the separator in the file path string. Now, though, I get "TypeError: an integer is required" when trying to open the log file. I seem to get it when trying to open any space- or tab-delimited file. I just made a quick space-delimited file and I get the same error. I even looked at the data in a hex editor to double-check and I don't see any odd bytes, so I have no idea what's happening.

Current read_table code:

data = pd.read_table('C:\\Aerosonde Test Logs\\MH_Data\\TEC_20170105-083220\\TestLogFile.txt',
               delim_whitespace='True')

Hex data for the test text file

I think the error is caused by the string holding the log file's path; try using \\ instead of \ as the separator. See also "UnicodeEncodeError: 'mbcs' codec can't encode characters in position 0--1: invalid character upon running a PyInstaller-compiled script".
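As a rough illustration of why the path literal matters: in an ordinary Python string, a backslash followed by digits such as \222 is read as an octal escape, so the file name no longer contains the characters that were typed. The sketch below uses only a shortened tail of the path from the question; the variable names are made up for the example.

```python
from pathlib import PureWindowsPath

# In a normal string literal, "\222" is an octal escape for the single
# character U+0092, so the backslash and the digits "222" disappear.
broken = "TEC_20170105-083220\222_1_4435_.log"

# Three ways to keep the path intact:
raw = r"TEC_20170105-083220\222_1_4435_.log"       # raw string literal
escaped = "TEC_20170105-083220\\222_1_4435_.log"   # doubled backslashes
forward = "TEC_20170105-083220/222_1_4435_.log"    # forward slashes

print(repr(broken))                                        # shows '\x92' in place of '\222'
print(raw == escaped)                                      # both spell the same string
print(PureWindowsPath(raw) == PureWindowsPath(forward))    # same Windows path
```

A raw string is usually the least error-prone choice for a Windows path, since the literal can be pasted in unchanged.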

Matteo Franchi's advice to use '\\' in the file path string fixed the UnicodeEncodeError, but I was still left with the TypeError when I tried to read in the data. It turned out I was not specifying delim_whitespace correctly. I had copied that argument from an example where True was inside quotes, which makes it the string 'True' rather than the boolean True, and that does not work. The code below worked fine.

data = pd.read_table('C:\\Aerosonde Test Logs\\MH_Data\\TEC_20170105-083220\\222_1_4435_.log',
                 delim_whitespace=True)
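For anyone who wants to try this without the original log file, here is a minimal sketch that parses the three sample rows from the question out of an in-memory buffer. It assumes a reasonably recent pandas and uses sep=r"\s+", which the pandas documentation describes as equivalent to delim_whitespace=True; newer releases appear to deprecate delim_whitespace in favor of it. The io.StringIO buffer and the variable names are stand-ins for the real file.

```python
import io

import pandas as pd

# Sample rows copied from the log snippet in the question.
sample = """<Clock>[ms] <Year>  <Month> <Day>   <Hours> <Minutes>   <Seconds>   <Lat>[rad]  <Lon>[rad]  <Height>[m]
48161   2017    1   5   4   30  13.366  5.02E-06    8.05E-07    267.37
49161   2017    1   5   4   30  14.366  5.01E-06    7.95E-07    266.61
50161   2017    1   5   4   30  15.366  5.02E-06    7.95E-07    266.24
"""

# sep=r"\s+" splits on any run of spaces or tabs; the first row becomes the header.
data = pd.read_csv(io.StringIO(sample), sep=r"\s+")

print(data.shape)
print(list(data.columns))
print(data["<Clock>[ms]"].tolist())
```

With a real file, the buffer would be replaced by the path, written as a raw string or with doubled backslashes.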
