PYTHON: appending a dataframe in a loop
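I am trying to retrieve stock information from 2 different URLs and write it to a pandas DataFrame. However, I keep getting errors. Could someone help me? I am pretty new to Python, so my code probably looks ugly :D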
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
import os
import requests
from bs4 import BeautifulSoup
import pandas as pd
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:87.0) Gecko/20100101 Firefox/87.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'Cache-Control': 'max-age=0'
}
PATH='C:\Program Files (x86)\chromedriver.exe'
options = Options()
options = webdriver.ChromeOptions()
options.add_argument('headless')
options.add_argument("--window-size=2550,1440")
s = Service('C:\Program Files (x86)\chromedriver.exe')
driver = webdriver.Chrome(PATH, options=options)
driver.implicitly_wait(10)
# create a dataframe
dn=[]
def accept_cookies():
    try:
        driver.find_element(By.ID, 'accept-choices').click()
    except:
        print('fu')
stocklist=["FB","KLIC"]
for x in stocklist:
    url = f"https://stockanalysis.com/stocks/{x}/financials/"
    driver.get(url)
    driver.implicitly_wait(10)
    accept_cookies()
    driver.implicitly_wait(10)
    driver.find_element(By.XPATH, "//span[text()='Quarterly']").click()
    xlwriter = pd.ExcelWriter(f'financial statements1.xlsx', engine='xlsxwriter')
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    df = pd.read_html(str(soup), attrs={'id': 'financial-table'})[0]
    new_df = pd.concat(df)
    dn.to_excel(xlwriter, sheet_name='key', index=False)
    xlwriter.save()
pd.concat expects a list of objects to concatenate, but you passed it only df. So I would replace pd.concat(df) with pd.concat([df, new_df]) and define new_df = pd.DataFrame() before the for loop.
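To make the point about pd.concat concrete, here is a minimal sketch. The two small frames and their column names are made up for illustration and only stand in for the scraped tables; no scraping is involved.

```python
import pandas as pd

# Hypothetical stand-ins for two scraped tables.
df_a = pd.DataFrame({"ticker": ["FB"], "revenue": [1.0]})
df_b = pd.DataFrame({"ticker": ["KLIC"], "revenue": [2.0]})

# Passing a single DataFrame is rejected: concat wants an iterable of frames.
try:
    pd.concat(df_a)
    raised = False
except TypeError as exc:
    raised = True
    print(f"pd.concat(df) failed: {exc}")

# Wrapping the frames in a list works.
combined = pd.concat([df_a, df_b], ignore_index=True)
print(combined)
```

ignore_index=True is optional; it simply renumbers the rows of the result from 0.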
If the .read_html() part works without problems, you should append each df to a list of dataframes:
dflist = []
for x in stocklist:
    url = f"https://stockanalysis.com/stocks/{x}/financials/"
    driver.get(url)
    driver.implicitly_wait(10)
    accept_cookies()
    driver.implicitly_wait(10)
    driver.find_element(By.XPATH, "//span[text()='Quarterly']").click()
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    dflist.append(pd.read_html(str(soup), attrs={'id': 'financial-table'})[0])
Once the iteration is done, you can simply concatenate the list of dataframes into one:
xlwriter = pd.ExcelWriter('financial statements1.xlsx', engine='xlsxwriter')
pd.concat(dflist).to_excel(xlwriter, sheet_name='key', index=False)
xlwriter.close()  # ExcelWriter.save() was removed in pandas 2.0; close() writes the file
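One thing the concatenated sheet does not show is which stock each row came from. The following is only a sketch of one way to handle that, not part of the answer above: combine_tables and write_combined are hypothetical helper names, the sample frames are made up, and the added "ticker" column is an assumption about what you would want in the output.

```python
import pandas as pd


def combine_tables(tables):
    """Concatenate per-ticker tables into one frame, tagging each row with its ticker."""
    frames = []
    for ticker, table in tables.items():
        # assign() returns a copy, so the scraped table itself is left untouched.
        frames.append(table.assign(ticker=ticker))
    # One concat call at the end instead of growing a DataFrame inside the loop.
    return pd.concat(frames, ignore_index=True)


def write_combined(tables, path):
    """Write the combined frame to one sheet; the with-block saves and closes the file."""
    with pd.ExcelWriter(path) as writer:
        combine_tables(tables).to_excel(writer, sheet_name="key", index=False)


# Made-up stand-ins for the tables returned by pd.read_html.
sample = {
    "FB": pd.DataFrame({"Quarter": ["Q1", "Q2"], "Revenue": [1.0, 2.0]}),
    "KLIC": pd.DataFrame({"Quarter": ["Q1", "Q2"], "Revenue": [3.0, 4.0]}),
}
combined = combine_tables(sample)
print(combined)
```

In the scraping loop you would fill a dict keyed by ticker instead of dflist, then call write_combined(tables, 'financial statements1.xlsx'). Writing the file needs an Excel engine such as xlsxwriter or openpyxl to be installed.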