pd.read_csv is not creating a dataframe with appropriate attributes

I'm trying to get pandas to read a CSV that's several million entries long, and I run into an error when I try to cut the data down to the relevant columns.

It's the early stages of the code, and I can't proceed without the appropriate data.

import pandas as pd


cols =  [1, 5, 6, 7, 10]
col_index = ["PGC", "GWGC", "HyperLEDA", "2MASS", "SDSS-DR12", "flag", "RA", "dec", "Luminosity Distance", "Distance Error", "Redshift", "Apparent B Magnitude", "B Magnitude Error",
"Apparent J Magnitude", "J magnitude error", "Apparent H Magnitude", "H Magnitude Error", "K Magnitude", "K Magnitude Error",
"Flag2", "Flag3"]#1, 5, 6, 7, 10]

df_cat = print(pd.read_csv("GLADE_2.3 - Copy.csv", chunksize = 10**8, index_col = col_index, usecols = cols))

print(df_cat.head())

AttributeError: 'NoneType' object has no attribute 'head'

It looks like the CSV hasn't been read in successfully. I'm aware that with such large files there is probably a better way to handle the file; any and all suggestions are appreciated.

EDIT: Thank you so much to everyone who's answered! I really appreciate the help, as I'm just trying to get to grips with pandas and keep mixing and matching it with built-in modules.

EDIT: nevermind... I didn't see that this has already been answered in the comments by @Kris and @ALollz.

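It looks like your df_cat is None: it holds the return value of print() rather than the result of pd.read_csv(), and print() always returns None. If you delete the print() call that wraps pd.read_csv(), the 'NoneType' error will go away.

Two other things will probably trip you up next. With chunksize set, pd.read_csv() returns an iterator of chunks rather than a DataFrame, so .head() is not available on it. Also, index_col selects the columns to use as the row index; I suspect col_index was meant to be passed as names (the column labels), which assumes the file has no header row.

# delete the print() call here; chunksize is dropped so a DataFrame is returned
# names = col_index assumes the file has no header row
df_cat = pd.read_csv("GLADE_2.3 - Copy.csv", names = col_index, usecols = cols)

# keep the print statement here
print(df_cat.head())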
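
To illustrate the point about print(), here is a minimal sketch using a tiny in-memory CSV instead of the real catalogue. The sample data and the names `csv_text`, `wrapped` and `df` are made up for the example and do not come from the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file, so the example is self-contained.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

# print() writes the DataFrame to stdout and returns None,
# so the name on the left is bound to None, not to the DataFrame.
wrapped = print(pd.read_csv(io.StringIO(csv_text)))
print(wrapped is None)

# Without the print() wrapper, the name is bound to the DataFrame itself,
# and DataFrame methods such as head() are available.
df = pd.read_csv(io.StringIO(csv_text))
print(df.head())
```

Calling `wrapped.head()` here would raise the same AttributeError as in the question.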

To expand on my comment above:

Try changing your code as follows:

import pandas as pd

cols = [1, 5, 6, 7, 10]
col_index = ["PGC", "GWGC", "HyperLEDA", "2MASS", "SDSS-DR12", "flag",
    "RA", "dec", "Luminosity Distance", "Distance Error", "Redshift",
    "Apparent B Magnitude", "B Magnitude Error",
    "Apparent J Magnitude", "J magnitude error", "Apparent H Magnitude",
    "H Magnitude Error", "K Magnitude", "K Magnitude Error",
    "Flag2", "Flag3"]

# No print() around read_csv, so df_cat is bound to the object it returns.
# chunksize is omitted so that a DataFrame, not a chunk iterator, comes back.
# names = col_index is an assumption: it treats the file as having no header row.
df_cat = pd.read_csv("GLADE_2.3 - Copy.csv", names = col_index, usecols = cols)

print(df_cat.head())
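
Since the question also asks about handling very large files: when chunksize is passed, pd.read_csv returns an iterator that yields one DataFrame per chunk, so each piece can be trimmed before everything is combined. The following is only a rough sketch under assumptions. The in-memory sample data, the column names in `all_names`, the chunk size and the filter are all hypothetical, and the file is assumed to have no header row.

```python
import io

import pandas as pd

# Hypothetical headerless sample standing in for the catalogue file.
csv_text = "\n".join(f"{i},{i * 2},{i * 3},{i * 4}" for i in range(10))
all_names = ["id", "RA", "dec", "dist"]  # illustrative labels, one per file column

# With chunksize set, read_csv returns an iterator of DataFrames.
reader = pd.read_csv(
    io.StringIO(csv_text),
    names=all_names,   # label every column in the file
    usecols=[1, 2],    # keep only these positions ("RA" and "dec")
    chunksize=4,       # rows per chunk; tune to the available memory
)

kept = []
for chunk in reader:
    # Each chunk is an ordinary DataFrame; filter it before storing.
    kept.append(chunk[chunk["RA"] >= 4])

# Combine the reduced chunks into a single DataFrame.
df_small = pd.concat(kept, ignore_index=True)
print(df_small.head())
```

Filtering inside the loop means only the reduced rows are held in memory at once, which is the main reason to use chunksize at all; a chunksize larger than the file simply yields one chunk.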
