Scrape table data from website using Python
I am trying to scrape the table below from a website using BeautifulSoup4 and Python (image: https://i.stack.imgur.com/PfPOQ.png).
My code so far is:
import requests
from bs4 import BeautifulSoup

url = "https://www.boerse-frankfurt.de/bond/xs0216072230"
content = requests.get(url)
soup = BeautifulSoup(content.text, 'html.parser')
tbody_data = soup.find_all("table", attrs={"class": "table widget-table"})
table1 = tbody_data[2]
table_body = table1.find('tbody')
rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    print(cols)
With this code I get the following result (image: https://i.stack.imgur.com/C190u.png):
[Issuer, ]
[Industry, ]
I can see Issuer and Industry, but their values are not shown in my result. Any help would be appreciated. TIA
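You are not getting the entire output because the data in the second td of each row of the sixth table on the page is loaded dynamically via JavaScript, so it is not present in the HTML that requests receives. You can get the rendered page by driving a real browser with selenium and then parsing the table with pandas.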
import time
from io import StringIO

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

webdriver_service = Service("./chromedriver")  # Your chromedriver path
driver = webdriver.Chrome(service=webdriver_service)

url = 'https://www.boerse-frankfurt.de/bond/xs0216072230-fuerstenberg-capital-erste-gmbh-2-522'
driver.get(url)
driver.maximize_window()
time.sleep(3)  # give the JavaScript time to fill in the table

table = BeautifulSoup(driver.page_source, 'lxml')
# Recent pandas versions expect a file-like object rather than a raw HTML string
df = pd.read_html(StringIO(str(table)))[5]  # index 5 is the sixth table
print(df)
driver.quit()  # close the browser when done
Output:
0 Issuer Fürstenberg Capital Erste GmbH
1 Industry Industrial and bank bonds
2 Market Open Market
3 Subsegment NaN
4 Minimum investment amount 1000
5 Listing unit Percent
6 Issue date 04/04/2005
7 Issue volume 61203000
8 Circulating volume 61203000
9 Issue currency EUR
10 Portfolio currency EUR
11 First trading day 27/06/2012
12 Maturity NaN
13 Extraordinary cancellation type Call option
14 Extraordinary cancellation date NaN
15 Subordinated Yes
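To check whether a value is really missing from the static HTML (and therefore filled in later by JavaScript), you can look for rows whose second cell is empty. The following is only a minimal sketch using the standard library: `CellCollector`, `empty_value_rows` and `SAMPLE_STATIC_HTML` are hypothetical names, and the sample markup is an assumed stand-in modelled on the `[Issuer, ]` result from the question, not the site's actual HTML.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the stripped text of every <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one list of cell strings per <tr>
        self._in_cell = False   # True while we are inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def empty_value_rows(html):
    """Return the labels of rows whose second cell contains no text."""
    parser = CellCollector()
    parser.feed(html)
    return [row[0] for row in parser.rows if len(row) >= 2 and not row[1]]


# Assumed stand-in for the HTML returned by requests.get(url).text
SAMPLE_STATIC_HTML = """
<table class="table widget-table"><tbody>
<tr><td>Issuer</td><td> </td></tr>
<tr><td>Industry</td><td></td></tr>
</tbody></table>
"""

print(empty_value_rows(SAMPLE_STATIC_HTML))
```

If every label comes back in that list for the real page source, the values are most likely injected after page load, which is why a browser-based approach such as the one above is needed.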
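Another solution, using only requests. Note that to get a result from the server you must set the required headers (the headers can be seen in Developer Tools -> Network tab).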
import requests

url = (
    "https://api.boerse-frankfurt.de/v1/data/master_data_bond?isin=XS0216072230"
)
headers = {
    "X-Client-TraceId": "d87b41992f6161c09e875c525c70ffcf",
    "X-Security": "d361b3c92e9c50a248e85a12849f8eee",
    "Client-Date": "2022-08-25T09:07:36.196Z",
}
data = requests.get(url, headers=headers).json()
print(data)
Prints:
{
    "isin": "XS0216072230",
    "type": {
        "originalValue": "25",
        "translations": {
            "de": "(Industrie-) und Bankschuldverschreibungen",
            "en": "Industrial and bank bonds",
        },
    },
    "market": {
        "originalValue": "OPEN",
        "translations": {"de": "Freiverkehr", "en": "Open Market"},
    },
    "subSegment": None,
    "cupon": 2.522,
    "interestPaymentPeriod": None,
    "firstAnnualPayDate": "2006-06-30",
    "minimumInvestmentAmount": 1000.0,
    "issuer": "Fürstenberg Capital Erste GmbH",
    "issueDate": "2005-04-04",
    "issueVolume": 61203000.0,
    "circulatingVolume": 61203000.0,
    "issueCurrency": "EUR",
    "firstTradingDay": "2012-06-27",
    "maturity": None,
    "noticeType": {
        "originalValue": "CALL_OPTION",
        "translations": {"others": "Call option"},
    },
    "extraordinaryCancellation": None,
    "portfolioCurrency": "EUR",
    "subordinated": True,
    "flatNotation": {"originalValue": "01", "translations": {"others": "flat"}},
    "quotationType": {
        "originalValue": "2",
        "translations": {"de": "Prozentnotiert", "en": "Percent"},
    },
}
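The response nests some fields under "translations", so to get label/value pairs similar to the table on the page you may want to flatten it. Here is a minimal sketch, assuming the structure shown in the output above. `english_value` and `to_rows` are hypothetical helper names, and `sample` is just a subset of that output.

```python
def english_value(value):
    """Return a display value for one field of the response."""
    if isinstance(value, dict) and "translations" in value:
        translations = value["translations"]
        # Prefer English, then the language-neutral "others" entry,
        # and fall back to the raw code if neither is present.
        return (
            translations.get("en")
            or translations.get("others")
            or value.get("originalValue")
        )
    return value


def to_rows(data):
    """Turn the response dict into a list of (field, value) tuples."""
    return [(key, english_value(value)) for key, value in data.items()]


# Subset of the response printed above
sample = {
    "issuer": "Fürstenberg Capital Erste GmbH",
    "type": {
        "originalValue": "25",
        "translations": {
            "de": "(Industrie-) und Bankschuldverschreibungen",
            "en": "Industrial and bank bonds",
        },
    },
    "market": {
        "originalValue": "OPEN",
        "translations": {"de": "Freiverkehr", "en": "Open Market"},
    },
    "noticeType": {
        "originalValue": "CALL_OPTION",
        "translations": {"others": "Call option"},
    },
    "maturity": None,
    "subordinated": True,
}

for field, value in to_rows(sample):
    print(f"{field}: {value}")
```

With the real response you would pass `data` instead of `sample`. The resulting list of tuples can also be handed to `pd.DataFrame` if you prefer a table.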