dataframe: pd.read_csv bug

I have a huge text file that I read with pd.read_csv. However, it fails to parse one particular line and returns NaN values for that row of the DataFrame. I found that if I add a space to that line, everything works. But I can't do that, because the format of my text file must not be changed at all. Has anyone who works with large text files faced the same problem?

I don't think it's possible to attach a file here, but if you think you can find a solution, I can send you my file and code by email.

Thanks in advance.

It's hard to tell without an example of the file you're working with. But if you know the index of that line, you can pass the skiprows argument when reading the CSV so that the line is not read into the DataFrame.

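i.e. if the line you need to skip is at index 4 (zero-based line number in the file, counting the header line as 0), pass it as a list. A plain integer such as skiprows=4 would instead skip the first four lines of the file:

df = pd.read_csv('file.csv', skiprows=[4])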
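
To see how skiprows behaves with a list versus an integer, here is a minimal sketch. The data in raw is made up for illustration and stands in for the real file, which was not posted.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real file.
raw = "a,b\n1,2\n3,4\n5,6\n7,8\n9,10\n"

# A list skips exactly the listed zero-based line numbers
# (line 0 is the header, so line 4 is the row "7,8").
skip_one = pd.read_csv(io.StringIO(raw), skiprows=[4])

# An integer skips that many lines from the top of the file,
# header included, so "7,8" ends up being used as the header.
skip_first_four = pd.read_csv(io.StringIO(raw), skiprows=4)

print(skip_one)
print(skip_first_four)
```

Note that skipping the line only hides the problem: that row is dropped from the DataFrame, so this helps only if losing it is acceptable.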
