
Combine Dataframes from Multiple Webpages into One Excel File

I currently check 10-15 different webpages manually for dividend information on various stocks. This is normally done once a month, but it would be beneficial if I could run this activity automatically each day, overwriting the prior day's data.

The website I use requires a login, so I have used Selenium to log into the website, and then extract data using the following code for a single position:

import pandas as pd

driver.get(URL)  # navigate to the stock's page (driver is the logged-in Selenium session)
df = pd.read_html(driver.page_source)[0]  # parse the first HTML table on the page
print(df.head())

where URL is the individual stock's page on the website.
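For reference, the login step beforehand looks roughly like this; the login URL, element IDs, and credentials below are placeholders, not the real site's:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/login")  # placeholder login URL

# placeholder element IDs -- the real site's form fields will differ
driver.find_element(By.ID, "username").send_keys("my_username")
driver.find_element(By.ID, "password").send_keys("my_password")
driver.find_element(By.ID, "login-button").click()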

I then write the result to Excel:

df.to_excel(excelfilename)

How would I do the above for these 10-15 different pages, and save it all to a single Excel sheet that overwrites the previous data each time?

I'm relatively new to Python, so apologies if this is quite straightforward.

Thanks in advance to anyone who can help!

Assuming every page can be scraped in the same simple way, you can loop over your URLs, collect the individual dataframes in a list, and then use pd.concat() to assemble all the data into one dataframe. You need to decide between axis=0 and axis=1, depending on the formatting of your data; check the pandas documentation for all the concatenation options.

dfs = []
for URL in URLS:
    driver.get(URL)  # navigate to each stock's page
    df = pd.read_html(driver.page_source)[0]  # first table on the page
    dfs.append(df)

master_df = pd.concat(dfs)  # stack all the tables into one dataframe
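As a side note on the axis choice, here is a minimal sketch with two toy dataframes (the column names are made up purely for illustration):

import pandas as pd

# Two hypothetical one-row dividend tables
a = pd.DataFrame({"stock": ["AAA"], "dividend": [0.50]})
b = pd.DataFrame({"stock": ["BBB"], "dividend": [0.75]})

print(pd.concat([a, b], axis=0))  # rows stacked: two rows, same columns
print(pd.concat([a, b], axis=1))  # columns side by side: one row, duplicated columns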

Finally, write the combined dataframe to Excel:

master_df.to_excel(excelfilename)
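Since to_excel() replaces the target file by default, each run will overwrite the previous day's data, which is what you asked for. If you don't want the dataframe index written as a column, pass index=False, e.g. master_df.to_excel(excelfilename, index=False).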
