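Pandas concatenate dataframes with for loop
I am trying to scrape tables from a website. The site's URL contains the date, so I have to loop over dates to get historical data. I generate the dates like this: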
import datetime
start = datetime.datetime.strptime("26-09-2016", "%d-%m-%Y")
end = datetime.datetime.strptime("30-09-2016", "%d-%m-%Y")
date_generated = [start + datetime.timedelta(days=x) for x in range(0, (end-start).days)]
dates_list = []
for date in date_generated:
    txt = str(str(date.day) + '.' + str(date.month) + '.' + str(date.year))
    dates_list.append(txt)
dates_list
After that, I run the code below to concatenate all the tables:
for i in range(0, 3):
    allURL = 'https://www.uzse.uz/trade_results?date=' + dates_list[i] + '&locale=en&mkt_id=ALL&page=%d'
    ndf_list = []
    for i in range(1, 100):
        url = allURL %i
        if pd.read_html(url)[0].empty:
            break
        else :
            ndf_list.append(pd.read_html(url)[0])
    ndf = pd.concat(ndf_list)
    ndf.insert(0, 'Date', dates_list[i])
mdf = pd.concat(ndf, ignore_index = True)
mdf
However, this does not work, and I get:
TypeError: first argument must be an iterable of pandas objects, you passed an object of type "DataFrame"
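I don't understand what I am doing wrong. I expect to get a single table with the data from 26, 27 and 28 September.
Please help.
I'm not sure about the last line, but I would approach it like this: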
import datetime
import pandas as pd
start = datetime.datetime.strptime("26-09-2016", "%d-%m-%Y")
end = datetime.datetime.strptime("30-09-2016", "%d-%m-%Y")
date_generated = [
    start + datetime.timedelta(days=x) for x in range(0, (end-start).days)]
dates_list = []
for date in date_generated:
    txt = str(str(date.day) + '.' + str(date.month) + '.' + str(date.year))
    dates_list.append(txt)
dates_list
ndf = pd.DataFrame() # create empty ndf
for i in range(0, 3):
    allURL = 'https://www.uzse.uz/trade_results?date=' + \
        dates_list[i] + '&locale=en&mkt_id=ALL&page=%d'
    # ndf_list = []
    for j in range(1, 100):
        url = allURL % j
        if pd.read_html(url)[0].empty:
            break
        else:
            # ndf_list.append(pd.read_html(url)[0])
            chunk = pd.read_html(url)[0]
            chunk['Date'] = dates_list[i]  # Date is positioned at last position, let's fix that
            # get a list of all the columns
            cols = chunk.columns.tolist()
            # rearrange the columns, move the last element (Date) to the first position
            cols = cols[-1:] + cols[:-1]
            # reorder the dataframe
            chunk = chunk[cols]
            ndf = pd.concat([ndf, chunk])
    # ndf = pd.concat(ndf_list)
    # ndf.insert(0, 'Date', dates_list[i])
print(ndf)
# mdf = pd.concat(ndf, ignore_index=True)
# mdf
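As for why the original code fails: pd.concat expects a list (or other iterable) of DataFrames, but the last line of the question passes ndf, which is a single DataFrame, hence the TypeError. The question's inner loop also reuses i, so dates_list[i] after that loop is indexed by the page number rather than by the day. Below is a rough sketch of the "collect in a list, concat once" pattern, run on made-up tables so that it works without hitting the site. The helper name combine_tables, the Ticker/Price columns and the sample values are my own placeholders, not anything from the real page.

```python
import pandas as pd


def combine_tables(tables_by_date):
    """Stack per-date lists of DataFrames into one frame with a leading 'Date' column.

    tables_by_date is a hypothetical dict: date string -> list of page tables.
    """
    frames = []
    for date, pages in tables_by_date.items():
        if not pages:
            continue  # pd.concat raises ValueError on an empty list
        day = pd.concat(pages, ignore_index=True)  # a list goes in, one frame comes out
        day.insert(0, 'Date', date)                # put the date in the first column
        frames.append(day)
    if not frames:
        return pd.DataFrame()
    # Concatenate the list of per-day frames, not a single DataFrame.
    return pd.concat(frames, ignore_index=True)


# Toy stand-ins for the pages that pd.read_html(url)[0] would return.
sample = {
    '26.9.2016': [pd.DataFrame({'Ticker': ['AAA'], 'Price': [10.0]}),
                  pd.DataFrame({'Ticker': ['BBB'], 'Price': [20.0]})],
    '27.9.2016': [pd.DataFrame({'Ticker': ['AAA'], 'Price': [11.0]})],
    '28.9.2016': [],  # a day with no rows is skipped
}
mdf = combine_tables(sample)
print(mdf)
```

With the real site you would fill each list from the page loop (using a separate variable such as j for the page number) and call pd.concat only once at the end, which also avoids growing a DataFrame inside the loop.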