
Exporting DataFrame to Excel using pandas without overwriting

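How can I export a DataFrame to Excel without overwriting the previous data? For example: I'm doing web scraping on a paginated table, so I save page 1 in a DataFrame, export it to Excel, and do the same again for page 2. But when I save, every record is erased and only the last one remains. Sorry for my English, here is my code: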

    import time
    import pandas as pd
    from bs4 import BeautifulSoup
    from selenium import webdriver

    i=1
    url = "https://stats.nba.com/players/traditional/?PerMode=Totals&Season=2019-20&SeasonType=Regular%20Season&sort=PLAYER_NAME&dir=-1"
    driver = webdriver.Firefox(executable_path=r'C:/Users/Fabio\Desktop/robo/geckodriver.exe')
    driver.get(url)
    time.sleep(5)
    driver.find_element_by_xpath("/html/body/main/div[2]/div/div[2]/div/div/nba-stat-table/div[2]/div[1]/table/thead/tr/th[9]").click()

    contador = 1
    #loop pagination
    while(contador < 4):
        #finding table
        elemento = driver.find_element_by_xpath("/html/body/main/div[2]/div/div[2]/div/div/nba-stat-table/div[2]")
        html_content = elemento.get_attribute('outerHTML')

        # 2. Parse HTML - BeautifulSoup
        soup = BeautifulSoup(html_content, 'html.parser')
        table = soup.find(name='table')

        # 3. Data Frame - Pandas
        df_full = pd.read_html(str(table))[0]
        df = df_full[['PLAYER','TEAM', 'PTS']]
        df.columns = ['jogador','time', 'pontuacao']

        dados1 = pd.DataFrame(df)

        driver.find_element_by_xpath("/html/body/main/div[2]/div/div[2]/div/div/nba-stat-table/div[1]/div/div/a[2]").click()
        contador = contador + 1

    #4. export to excel
    dados = pd.DataFrame(df)
    dados.to_excel("fabinho.xlsx")

    driver.quit()

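Each time you go through the loop, you reassign df to whatever data you just retrieved, so only the last page is left when you export. One solution is to append each page's data to a list and then pd.concat the list at the end.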

    import time
    import pandas as pd
    from bs4 import BeautifulSoup
    from selenium import webdriver

    i=1
    url = "https://stats.nba.com/players/traditional/?PerMode=Totals&Season=2019-20&SeasonType=Regular%20Season&sort=PLAYER_NAME&dir=-1"
    driver = webdriver.Firefox(executable_path=r'C:/Users/Fabio\Desktop/robo/geckodriver.exe')
    driver.get(url)
    time.sleep(5)
    driver.find_element_by_xpath("/html/body/main/div[2]/div/div[2]/div/div/nba-stat-table/div[2]/div[1]/table/thead/tr/th[9]").click()

    contador = 1
    # one DataFrame per page is collected here
    df_list = list()
    #loop pagination
    while(contador < 4):
        #finding table
        elemento = driver.find_element_by_xpath("/html/body/main/div[2]/div/div[2]/div/div/nba-stat-table/div[2]")
        html_content = elemento.get_attribute('outerHTML')

        # 2. Parse HTML - BeautifulSoup
        soup = BeautifulSoup(html_content, 'html.parser')
        table = soup.find(name='table')

        # 3. Data Frame - Pandas
        df_full = pd.read_html(str(table))[0]
        df = df_full[['PLAYER','TEAM', 'PTS']]
        df.columns = ['jogador','time', 'pontuacao']

        df_list.append(df)

        driver.find_element_by_xpath("/html/body/main/div[2]/div/div[2]/div/div/nba-stat-table/div[1]/div/div/a[2]").click()
        contador = contador + 1

    #4. export to excel
    dados = pd.concat(df_list)
    dados.to_excel("fabinho.xlsx")

    driver.quit()
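To see the difference without running the browser, here is a minimal sketch that compares reassigning in the loop with appending to a list and concatenating. The three small DataFrames in `pages` are made-up stand-ins for the tables that `pd.read_html` would return per page, and `ignore_index=True` is my own addition, not part of the answer above.

```python
import pandas as pd

# Hypothetical stand-ins for the per-page tables (not real NBA data).
pages = [
    pd.DataFrame({"jogador": ["A", "B"], "time": ["X", "Y"], "pontuacao": [10, 20]}),
    pd.DataFrame({"jogador": ["C", "D"], "time": ["X", "Z"], "pontuacao": [30, 40]}),
    pd.DataFrame({"jogador": ["E"], "time": ["Y"], "pontuacao": [50]}),
]

# Reassigning inside the loop: df only ever holds the most recent page.
for page in pages:
    df = page
last_only = df

# Appending to a list and concatenating once keeps every page.
df_list = []
for page in pages:
    df_list.append(page)

# ignore_index=True (an assumption of this sketch) renumbers the rows 0..n-1
# instead of repeating each page's own 0-based index.
dados = pd.concat(df_list, ignore_index=True)

print(len(last_only), "row(s) after reassigning")
print(len(dados), "row(s) after append + concat")
```

Without `ignore_index=True`, each page keeps its own row labels, so the index column written by `to_excel` would restart at 0 for every page. If you don't want the index in the spreadsheet at all, `to_excel` also accepts `index=False`.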

