I have a while loop, and a dataframe is generated on each iteration. I want to merge the dataframes after every iteration on a key (let's say the column id):
while i < 600:
    try:
        player_html = urlopen("https://fantasy.premierleague.com/drf/element-summary/" + str(i))
        player_raw = json.load(player_html)
        fixture = player_raw['fixtures']
        data_df = pd.DataFrame(fixture)
        new_column = data_df.columns
        new_df = pd.DataFrame(columns=new_column)
        new_df = new_df.merge(data_df, on='id')
    except:
        # Write all of the numbers for which there were errors to a file
        errfile = open(player_error, "a")
        errfile.write(str(i) + "\n")
        pass
    print(i)
    i += 1
return new_df
This was my logic, but it is not working. How can I fix this? Thanks.
My guess is that data_df should be the initial DataFrame to which every subsequent new_df is appended.
However, because it is assigned inside the while loop, it keeps getting reset on each iteration. Assigning data_df before the loop should do the job.
data = pandas.read_excel(*******)
data_df = pd.DataFrame(data)
while i < 600:
    try:
        new_column = data_df.columns
        new_df = pd.DataFrame(columns=new_column)
        new_df = new_df.merge(data_df, on='id')
    except:
        # Write all of the numbers for which there were errors to a file
        errfile = open(player_error, "a")
        errfile.write(str(i) + "\n")
        pass
    print(i)
    i += 1
In addition, pandas.read_excel already returns a DataFrame, so the second line is redundant. Also, the first line uses the name pandas; if the library was imported with import pandas as pd, use pd consistently (i.e. pd.read_excel on the first line).
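To illustrate the idea of keeping the accumulator outside the loop, here is a minimal sketch. It is not the asker's exact code: fetch_fixtures is a hypothetical stand-in for the HTTP request, the column names are made up, and failed ids are collected in a list rather than written to a file. It also stacks rows with pd.concat instead of merging, which is an assumption about what the asker ultimately wants.

```python
import pandas as pd


def fetch_fixtures(player_id):
    """Hypothetical stand-in for the HTTP request in the question.

    Returns a list of fixture dicts, or raises ValueError for a missing player.
    """
    if player_id == 2:
        raise ValueError("no data for player 2")
    return [
        {"id": player_id * 10 + 1, "player": player_id, "opponent": "A"},
        {"id": player_id * 10 + 2, "player": player_id, "opponent": "B"},
    ]


def collect_fixtures(player_ids):
    frames = []  # accumulator lives outside the loop, so it is never reset
    failed = []  # ids that raised an error (the question writes these to a file)
    for player_id in player_ids:
        try:
            frames.append(pd.DataFrame(fetch_fixtures(player_id)))
        except ValueError:
            failed.append(player_id)
    # Stack all per-player frames once, after the loop has finished
    combined = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    return combined, failed


combined, failed = collect_fixtures(range(1, 4))
print(combined)
print("failed ids:", failed)
```

Collecting the frames in a list and concatenating once is usually cheaper than growing a DataFrame on every iteration. If the per-iteration frames carry different columns for the same id, a column-wise join such as combined.merge(other, on="id") may be closer to what is needed. Note also that an inner merge against an empty DataFrame cannot produce any rows, which is likely why the original loop never accumulated anything.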