
Appending multiple pandas dataframes using a for loop returns an empty dataframe

I need to download a number of .csv files, convert each one to a pandas dataframe, and append them to each other.

Each csv is available at a url that is created daily. Using datetime, the urls can easily be generated and put in a list.

I am able to open each url in the list individually.

When I try to open several of them and append them together, I get an empty dataframe. The code looks like this:

#Imports
import datetime 
import pandas as pd


#Testing can open .csv file
data = pd.read_csv('https://promo.betfair.com/betfairsp/prices/dwbfpricesukwin01022018.csv')
data.iloc[:5]


#Taking heading to use to create new dataframe 
data_headings = list(data.columns.values)


#Setting up string for url
path_start = 'https://promo.betfair.com/betfairsp/prices/dwbfpricesukwin'
file = ".csv"


#Getting dates which are used in url 
start = datetime.datetime.strptime("01-02-2018", "%d-%m-%Y")
end = datetime.datetime.strptime("04-02-2018", "%d-%m-%Y")
date_generated = [start + datetime.timedelta(days=x) for x in range(0, (end-start).days)]


#Creating new dataframe which is appended to
for heading in data_headings:
    data = {heading: []}

df = pd.DataFrame(data, columns=data_headings)


#Creating list of url
date_list = []

for date in date_generated:
    date_string = date.strftime("%d%m%Y")

    x = path_start + date_string + file
    date_list.append(x)


#Opening and appending csv files from list which contains url
for full_path in date_list:
    data_link = pd.read_csv(full_path)
    df.append(data_link)

print(df)

I have checked whether the csv files are simply empty, but they are not. Any help would be appreciated.

Cheers, Sandy

You are never storing the appended dataframe. df.append does not modify df in place; it returns a new dataframe, which your loop discards. The line:

df.append(data_link)

should be:

df = df.append(data_link)

However, this may be the wrong approach. Rather than appending inside the loop, you probably want to read each URL in the list into its own dataframe and concatenate them all at once. Check out this similar question and see if it can improve your code!
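A minimal sketch of the concatenation approach, assuming pd.concat is what is meant here. So that it runs offline, in-memory CSV text stands in for the real URLs, and the column names EVENT_ID and BSP are made up for illustration; with the real data you would iterate over date_list instead. Note also that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so on current versions pd.concat is the way to go in any case.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the daily CSV URLs, so the sketch runs offline.
# With the real data, each item would be a URL string taken from date_list.
csv_sources = [
    io.StringIO("EVENT_ID,BSP\n1,2.5\n2,3.1\n"),
    io.StringIO("EVENT_ID,BSP\n3,4.0\n"),
]

# Read every source into its own dataframe, then concatenate once at the end.
frames = [pd.read_csv(source) for source in csv_sources]
df = pd.concat(frames, ignore_index=True)

print(df)
```

Building the list first and concatenating once also avoids copying the growing dataframe on every iteration, which is what repeated appends do.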

I really can't understand what you wanted to do here:

#Creating new dataframe which is appended to
for heading in data_headings:
    data = {heading: []}

df = pd.DataFrame(data, columns=data_headings)
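For what it is worth, the loop in that snippet rebinds data on every pass, so only the last heading survives in the dict; it is the columns argument that actually supplies the full set of column names. A small sketch of that behaviour, using made-up headings in place of the real CSV columns, next to a shorter way to build an empty frame with the same columns:

```python
import pandas as pd

# Hypothetical headings standing in for list(data.columns.values).
data_headings = ["EVENT_ID", "MENU_HINT", "BSP"]

# Each iteration replaces the dict, so only the last heading is kept.
for heading in data_headings:
    data = {heading: []}

print(data)

# The columns argument fills in the missing names, giving an empty frame.
df = pd.DataFrame(data, columns=data_headings)

# More direct: skip the loop and pass only the column names.
empty = pd.DataFrame(columns=data_headings)
```

If the frames are concatenated from a list anyway, this empty starting dataframe is not needed at all.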

By the way, try this:

for full_path in date_list:
    data_link = pd.read_csv(full_path)
    df = df.append(data_link.copy())
