I know similar questions have been posted before, but I haven't found anything that works for this case. I hope you can help.
Here is a summary of the issue:
The problem I'm facing is that when the dataframe is exported to Excel, it overwrites the data from the previous iteration. Hence, when I run the code and the scraping is completed, I only get the data from the last for-loop iteration.
Please advise which line(s) of code I need to add so that all the iterations are captured in the Excel sheet. More specifically, each iteration should export its data to Excel starting from the first empty row.
Here is an extract from the code:
for i in range(50, 60):
    url = urlA + str(i)  # this is the url generator; urlA is the main link excluding pagination
    driver.get(url)
    time.sleep(random.randint(3, 7))
    text = driver.find_element_by_xpath('/html/body/pre').text
    data = pd.DataFrame(eval(text))
    export_excel = data.to_excel(xlpath)
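For reference, writing each iteration from the first empty row could be sketched roughly as follows. This is only a minimal sketch, assuming openpyxl is installed; `append_frame_to_excel` is a hypothetical helper name (not part of pandas), and it relies on openpyxl's `Worksheet.append`, which adds a row below the last used row of the sheet.

```python
from pathlib import Path

import pandas as pd
from openpyxl import Workbook, load_workbook


def append_frame_to_excel(frame, path):
    """Append the rows of `frame` below the last used row of the workbook at `path`.

    Hypothetical helper: creates the file (with a header row) if it is missing.
    """
    path = Path(path)
    if path.exists():
        workbook = load_workbook(path)
        sheet = workbook.active
    else:
        workbook = Workbook()
        sheet = workbook.active
        sheet.append(list(frame.columns))  # header row, written only once
    # Each append() call lands on the first empty row of the sheet
    for row in frame.itertuples(index=False, name=None):
        sheet.append(list(row))
    workbook.save(path)


# Inside the scraping loop this would replace data.to_excel(xlpath):
#     append_frame_to_excel(data, xlpath)
```

The file is reopened and saved on every iteration, which is fine for a handful of pages but slow for many, and it assumes every page returns the same columns in the same order.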
Thanks Dijkgraaf. Your proposal worked.
Here is the full code for others (for future reference).
Apologies for the font, I couldn't set it properly. Anyway, I hope the code below is of some use to someone in the future.
import pandas as pd

xlpath = "c:/projects/excelfile.xlsx"
df = pd.DataFrame()  # create an empty dataframe before the for loop starts
urlA = "www.your website.com"  # placeholder: the main link excluding pagination
for i in range(1, 10):
    url = urlA + str(i)  # url generator for pagination (to loop through the pages)
    driver.get(url)
    text = driver.find_element_by_xpath('/html/body/pre').text  # gets text from the site (Selenium 4.3+ needs driver.find_element("xpath", ...) instead)
    data = pd.DataFrame(eval(text))  # evaluates the extracted text and converts it to a pandas dataframe
    df = pd.concat([df, data])  # adds the new data to df (df.append was removed in pandas 2.0)
df.to_excel(xlpath)  # exports the consolidated dataframe (df) to excel once the loop is done
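For anyone reusing this, the same idea can be sketched a little more defensively. This is only a sketch under assumptions: `combine_pages` is a hypothetical helper name, the sample strings are made-up stand-ins for the scraped text, and it assumes each page's text is a Python-style literal (a list of dicts). It collects the per-page frames in a list and concatenates once, and it swaps `eval` for `ast.literal_eval`, which parses literals without executing code. If the site returns strict JSON, `json.loads` would be the usual choice instead.

```python
import ast

import pandas as pd


def combine_pages(page_texts):
    """Parse each page's text into a DataFrame and stack them into one."""
    frames = []
    for text in page_texts:
        records = ast.literal_eval(text)  # parses literals only, never runs code
        frames.append(pd.DataFrame(records))
    if not frames:
        return pd.DataFrame()
    # One concat at the end avoids re-copying the growing frame on every page
    return pd.concat(frames, ignore_index=True)


# Made-up sample data standing in for the text scraped from two pages
sample_pages = [
    "[{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}]",
    "[{'id': 3, 'name': 'c'}]",
]
combined = combine_pages(sample_pages)
# combined.to_excel(xlpath, index=False) would then write everything in one go
```

With `ignore_index=True` the combined frame gets a fresh 0..n-1 index, so the Excel sheet does not repeat each page's row numbers.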