Reading CSV files: list index out of range
I am supposed to read a CSV file of Donald Trump's Facebook status updates. I need to build a list of dictionaries like this:
[{'link_name': 'Timeline Photos',
'num_angrys': '7',
'num_comments': '543',
'num_hahas': '17',
'num_likes': '6178',
'num_loves': '572',
'num_reactions': '6813',
'num_sads': '0',
'num_shares': '359',
'num_wows': '39',
'status_id': '153080620724_10157915294545725',
'status_link': 'https://www.facebook.com/DonaldTrump/photos/a.488852220724.393301.153080620724/10157915294545725/?type=3',
'status_message': 'Beautiful evening in Wisconsin- THANK YOU for your incredible support tonight! Everyone get out on November 8th - and VOTE! LETS MAKE AMERICA GREAT AGAIN! -DJT',
'status_published': '10/17/2016 20:56:51',
'status_type': 'photo'},
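I have to do this in code. I need to get the first two status updates, but when I run my code I get an error message saying "list index out of range".
Here is the code: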
def read_csv(input_file, delimiter=","):
    # your code here
    import csv
    csv_data= []
    with open(filename, "r") as csvfile:
        for row in csvfile:
            row = row.strip("\n")
            columns = row.split(",")
            dict_row = {"link_name": columns [0],
                        "num_angrys": columns [1],
                        "num_comments":columns[2],
                        "num_hahas": columns [3],
                        "num_loves": columns [4],
                        "num_reactions": columns [5],
                        "num_sads": columns [6],
                        "num_shares": columns[7],
                        "num_wows": columns [8],
                        "status_id": columns[9],
                        "status_link": columns[10],
                        "status_message": columns [11],
                        "status_published": columns[12],
                        "status_type": columns[13]}
            csv_data.append(dict_row)

filename = "../Data/csv_data/trump_facebook.tsv"
status_updates = read_csv(filename, delimiter="\t")
status_updates[0:2]
Here is the error message:
IndexError                                Traceback (most recent call last)
<ipython-input-16-352e8f130d5d> in <module>
     27 filename = "../Data/csv_data/trump_facebook.tsv"
---> 28 status_updates = read_csv(filename, delimiter="\t")
     29 status_updates[0:2]

<ipython-input-16-352e8f130d5d> in read_csv(input_file, delimiter)
      9
     10             dict_row = {"link_name": columns [0],
---> 11                         "num_angrys": columns [1],
     12                         "num_comments":columns[2],
     13                         "num_hahas": columns [3],

IndexError: list index out of range
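Any help would be greatly appreciated!
Update: I have solved the problem with this new code, but printing status_updates[0:2] gives me output that includes the header row, as follows: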
def read_csv(input_file, delimiter=","):
    # your code here
    csv_data= []
    with open(filename, "r") as csvfile:
        for row in csvfile:
            row = row.strip("\n")
            columns = row.split("\t")
            dict_row = {"link_name":columns[2],
                        "num_angrys" : columns[14],
                        "num_comments": columns[7],
                        "num_hahas": columns[12],
                        "num_likes": columns[9],
                        "num_loves": columns[10],
                        "num_reactions": columns [6],
                        "num_sads": columns[13],
                        "num_shares": columns [8],
                        "num_wows": columns [11],
                        "status_id": columns [0],
                        "status_link": columns [4],
                        "status_message": columns [1],
                        "status_published": columns [5],
                        "status_type": columns [3],}
            csv_data.append(dict_row)
    return csv_data

filename = "../Data/csv_data/trump_facebook.tsv"
status_updates = read_csv(filename, delimiter="\t")
status_updates[0:2]
output:
[{'link_name': 'link_name',
'num_angrys': 'num_angrys',
'num_comments': 'num_comments',
'num_hahas': 'num_hahas',
'num_likes': 'num_likes',
'num_loves': 'num_loves',
'num_reactions': 'num_reactions',
'num_sads': 'num_sads',
'num_shares': 'num_shares',
'num_wows': 'num_wows',
'status_id': 'status_id',
'status_link': 'status_link',
'status_message': 'status_message',
'status_published': 'status_published',
'status_type': 'status_type'},
{'link_name': 'Timeline Photos',
'num_angrys': '7',
'num_comments': '543',
'num_hahas': '17',
'num_likes': '6178',
'num_loves': '572',
'num_reactions': '6813',
'num_sads': '0',
'num_shares': '359',
'num_wows': '39',
'status_id': '153080620724_10157915294545725',
'status_link': 'https://www.facebook.com/DonaldTrump/photos/a.488852220724.393301.153080620724/10157915294545725/?type=3',
'status_message': 'Beautiful evening in Wisconsin- THANK YOU for your incredible support tonight! Everyone get out on November 8th - and VOTE! LETS MAKE AMERICA GREAT AGAIN! -DJT',
'status_published': '10/17/2016 20:56:51',
'status_type': 'photo'}]
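I could easily replace status_updates[0:2] with [1:3], but there must be a more elegant way to drop the header row, so that I don't have to remember to start at index 1 every time I call this function. Thanks for all your help!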
What does the CSV file look like? I see that the function call passes \t as the delimiter, but the code itself always splits on a comma (",").
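A minimal sketch of why that mismatch would produce the IndexError. The sample line below is made up to resemble a tab-separated header row; it is not taken from the real file:

```python
# Hypothetical tab-separated line (an assumption, not the real file's contents).
line = "status_id\tstatus_message\tlink_name\n"

# Splitting on "," finds no commas, so the whole line comes back as one element.
wrong = line.strip("\n").split(",")
print(len(wrong))  # only one element, so wrong[1] would raise IndexError

# Splitting on the delimiter that was passed in gives one element per column.
delimiter = "\t"
right = line.strip("\n").split(delimiter)
print(right)
```

This matches the traceback in the question: columns[0] succeeds and columns[1] is the first lookup that fails.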
Also, you may want to consider using Python's csv module for this job.
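As a rough sketch of that suggestion, csv.DictReader from the standard library reads the first row as field names and yields one dict per remaining row, so the header never ends up in the data. The function name and signature follow the question; the utf-8 encoding is an assumption about the file:

```python
import csv

def read_csv(input_file, delimiter=","):
    """Return the rows of a delimited file as a list of dicts keyed by the header row."""
    # encoding="utf-8" is an assumption; adjust it to match the actual file.
    with open(input_file, "r", newline="", encoding="utf-8") as csvfile:
        # DictReader consumes the first row as field names.
        reader = csv.DictReader(csvfile, delimiter=delimiter)
        return [dict(row) for row in reader]
```

With this version, `read_csv("../Data/csv_data/trump_facebook.tsv", delimiter="\t")[0:2]` should return the first two status updates rather than the header plus one update.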
You can solve your problem with this approach:
1. Read the first line of the CSV file as the header.
2. Build the header-to-row mapping with the `zip()` function.
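def row_preprocess(row, delimiter='\t'):
    # Drop the trailing newline and split the line into its fields.
    return row.strip('\n').split(delimiter)

def read_csv(path_to_file, delimiter='\t'):
    csv_data = []
    with open(path_to_file, 'r') as f:
        # The first line holds the column names; next() consumes it
        # so the loop below only sees data rows.
        column_values = row_preprocess(next(f), delimiter)
        for row in f:
            row_values = row_preprocess(row, delimiter)
            # Pair each column name with the value in the same position.
            mapping = dict(zip(column_values, row_values))
            csv_data.append(mapping)
    return csv_data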
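One thing to keep in mind with the `zip()` mapping: `zip()` stops at the shorter of its inputs, so a row with fewer fields than the header silently loses keys instead of raising an error. A small sketch with made-up values (the header subset and the short row are assumptions for illustration):

```python
# Made-up header and a row that is one field short.
header = ["status_id", "status_message", "status_type"]
row_values = ["42", "hello"]

mapping = dict(zip(header, row_values))
print(mapping)  # the 'status_type' key is missing

# One possible guard: compare the field counts before building the mapping.
if len(row_values) != len(header):
    print(f"expected {len(header)} fields, got {len(row_values)}")
```

Whether to skip, pad, or reject such rows depends on the data, so the check above only reports the mismatch.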