How to write to an existing Excel file with pandas without overwriting the existing data

Question

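I know similar questions have been posted before, but I haven't found anything that fits this case. I hope you can help.

Here is a summary of the problem:

I am writing web-scraping code with Selenium (for an assignment).
The code uses a for loop to go from one page to the next.
The output of the code is a DataFrame from each page number, which is exported to Excel (basically a table).
The DataFrames from all web pages should be captured in a single Excel sheet (not in multiple sheets of the Excel file).
Every web page has the same data format (i.e. the number of columns and the column headers are the same, but the row values differ).
For information, I am using pandas because it helps convert the output from the website to Excel.

The problem I am facing is that when the DataFrame is exported to Excel, it overwrites the data from the previous iteration. So when I run the code and the scraping finishes, I only get the data from the last iteration of the for loop.

Please advise which lines of code I need to add so that all iterations are captured in the Excel sheet. More specifically, each iteration should export its data to Excel starting from the first empty row.

Here is an excerpt of the code: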

for i in range(50, 60):  
    url = urlA + str(i)  # URL generator; urlA is the main link excluding pagination

    driver.get(url)

    time.sleep(random.randint(3,7))

    text=driver.find_element_by_xpath('/html/body/pre').text

    data=pd.DataFrame(eval(text))

    export_excel = data.to_excel(xlpath)

Answer 1

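Thanks, 迪克格拉夫. Your suggestion worked.

Here is the complete code for others (for future reference).

Apologies for the formatting, I couldn't get it to display properly. Anyway, I hope the code below is useful to someone in the future.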

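import pandas as pd
from selenium.webdriver.common.by import By

xlpath = "c:/projects/excelfile.xlsx"

frames = []  # one DataFrame per page is collected here before anything is written

urlA = "www.your website.com"  # placeholder for the main link, excluding pagination

for i in range(1, 10):
    url = urlA + str(i)  # URL generator for pagination (to loop through the pages)
    driver.get(url)  # driver is the Selenium WebDriver created earlier in the script
    # find_element_by_xpath was removed in Selenium 4.3; find_element(By.XPATH, ...) replaces it
    text = driver.find_element(By.XPATH, '/html/body/pre').text  # gets text from site
    # eval executes whatever the page returns, so only use it on a source you trust
    data = pd.DataFrame(eval(text))  # evaluates the extracted text and converts it to a pandas DataFrame
    frames.append(data)  # keeps this page's data instead of writing it out immediately

# DataFrame.append was removed in pandas 2.0; pd.concat combines the collected frames
df = pd.concat(frames)

df.to_excel(xlpath)  # exports the consolidated DataFrame (df) to Excel once, after the loop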
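
For anyone who really does need to add rows to a workbook that already exists on disk (which is what the title asks), rather than collecting everything in memory first, here is a rough sketch of one way to do it. It assumes pandas 1.4 or later with openpyxl installed, and that the target sheet already starts with a header row. The function name append_to_excel and the default sheet name are made up for illustration; they are not part of pandas.

```python
from pathlib import Path

import pandas as pd


def append_to_excel(df, path, sheet_name="Sheet1"):
    """Write df below the rows already present in sheet_name (hypothetical helper)."""
    path = Path(path)
    if not path.exists():
        # First call: there is nothing to preserve, so write header and rows.
        df.to_excel(path, sheet_name=sheet_name, index=False)
        return

    # mode="a" opens the existing workbook instead of replacing the file;
    # if_sheet_exists="overlay" writes into the existing sheet instead of
    # raising an error or creating a renamed copy.
    with pd.ExcelWriter(path, engine="openpyxl", mode="a",
                        if_sheet_exists="overlay") as writer:
        sheet = writer.sheets.get(sheet_name)
        # max_row is 1-based and startrow is 0-based, so passing max_row as
        # startrow lands on the first row after the existing data.
        start_row = sheet.max_row if sheet is not None else 0
        df.to_excel(writer, sheet_name=sheet_name, startrow=start_row,
                    header=(start_row == 0), index=False)
```

Inside the scraping loop this would be called as append_to_excel(data, xlpath) in place of data.to_excel(xlpath). It reopens the workbook on every iteration, so for a handful of pages the collect-then-write approach above is simpler and faster; appending in place mainly helps when the file has to survive between separate runs.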

Solution 1
1 accepted 2019-10-09 13:51:10