
Unable to load .csv file contents correctly into a Python data frame

For an in-class assignment, I am attempting to load a CSV file into a DataFrame in Python using a Jupyter notebook.

Below is my attempt. I have defined the columns as follows:

gnacs_y = "id|postedTime|body|None1|['twitter_entiteis:urls:url']|None2|['actor:languages_list-items']|gnip:language:value|twitter_lang|[u'geo:coordinates_list-items']|geo:type|None3|None4|None5|None6|actor:utcOffset|None7|None8|None9|None10|None11|None12|None13|None14|None15|actor:displayName|actor:preferredUsername|actor:id|gnip:klout_score|actor:followersCount|actor:friendsCount|actor:listedCount|actor:statusesCount|Tweet|None16|None17|None18"
colnames = gnacs_y.split('|')

Then I have the following:

import pandas as pd

df_3 = pd.read_csv('../data/twitter_sample.csv', sep='|', names=colnames)

df_3.tail(10)

However, when the data is loaded, I see only the id column populated, with what looks like HTML markup, and all the other columns are NaN, even though there is data in the .csv file. I have attached screenshots of what I see in the Jupyter notebook and of the contents of the CSV file. I am not sure if I messed up the initial declaration of the column names in gnacs_y.

Link to the CSV file for the assignment: https://github.com/terratenney/yorkBigData/blob/master/assignments/data/twitter_sample.csv

Any help would be greatly appreciated.

[Screenshots: load result; .csv file contents]

Your file is not a CSV file; it is an HTML file that has tables in it. If your assignment says you should have a CSV file, have you checked that you downloaded the right file?
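One quick way to confirm this before calling read_csv is to look at the first bytes of the saved file. The following is only a rough sketch: the helper names looks_like_html and file_looks_like_html are made up for illustration, and the check assumes an HTML page begins with a doctype or an html tag, which will not catch every possible page.

```python
def looks_like_html(head: bytes) -> bool:
    """Heuristic: does this chunk of bytes start like an HTML document?

    Hypothetical helper, not part of pandas. It skips a UTF-8 byte order
    mark and leading whitespace, then looks for a doctype or <html tag.
    """
    stripped = head.lstrip(b"\xef\xbb\xbf").lstrip().lower()
    return stripped.startswith((b"<!doctype html", b"<html"))


def file_looks_like_html(path: str, probe_bytes: int = 2048) -> bool:
    """Read only the start of the file and apply the heuristic above."""
    with open(path, "rb") as f:
        return looks_like_html(f.read(probe_bytes))


# Usage with the path from the question (assumed to exist locally):
# if file_looks_like_html('../data/twitter_sample.csv'):
#     print("This is a saved web page, not the CSV data.")
```

If the check returns True, the problem is the download itself rather than the column names passed to read_csv.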

EDIT: It looks like the file was saved incorrectly. If you click the Raw button on GitHub and download that, you will get the correct CSV file without the HTML.
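As an alternative to downloading by hand, the raw address can usually be derived from the page link and passed straight to pandas. This is a sketch under assumptions: github_blob_to_raw and load_pipe_delimited are hypothetical helpers, the raw.githubusercontent.com pattern is the usual GitHub convention for public repositories rather than something stated in the question, and whether the file has a header row is not known here.

```python
def github_blob_to_raw(url: str) -> str:
    """Turn a github.com '/blob/' page URL into the matching raw file URL.

    Hypothetical helper; assumes the usual layout
    https://github.com/<owner>/<repo>/blob/<branch>/<path>.
    """
    prefix = "https://github.com/"
    if not url.startswith(prefix) or "/blob/" not in url:
        raise ValueError(f"not a GitHub blob URL: {url}")
    owner_repo, branch_and_path = url[len(prefix):].split("/blob/", 1)
    return f"https://raw.githubusercontent.com/{owner_repo}/{branch_and_path}"


def load_pipe_delimited(blob_url: str, colnames: list):
    """Read the raw file with the separator and column names from the question."""
    import pandas as pd  # imported here so the URL helper works without pandas

    return pd.read_csv(github_blob_to_raw(blob_url), sep="|", names=colnames)


# Usage (needs network access and pandas):
# df_3 = load_pipe_delimited(
#     "https://github.com/terratenney/yorkBigData/blob/master/assignments/data/twitter_sample.csv",
#     colnames,
# )
```

Reading from the raw URL avoids saving the surrounding web page by mistake, since pandas then receives only the file contents.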
