
Scraping weird table structure using Beautiful Soup

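I am trying to scrape a few tables from this website that are hidden under expand buttons. However, the table structure looks really weird and I am having trouble.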

My code so far is:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
import time
import requests
from bs4 import BeautifulSoup

groups = ['Currencies', 'Commodities', 'Indices']
assets = {'Currencies': ['eur/usd','gbp/usd', 'usd/jpy'], 'Commodities': ['gold', 'silver', 'copper'], 'Indices': ['spx50', 'dj30']}


browser = webdriver.Chrome(executable_path=r'/usr/local/bin/chromedriver', options=Options())
browser.get('https://www.etoro.com/trading/fees/#cfds')
soup = BeautifulSoup(browser.page_source, 'lxml')

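My goal is to get the fees from the "Spreads" section of the site (not the "Overnight fees" section) for each asset group in my groups variable, restricted to the individual assets listed for that group in assets. For example, for each group I want something like this: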

| Instrument | Fee    |
|------------|--------|
| eur/usd    | 1 Pips |
| gbp/usd    | 2 Pips |
| usd/jpy    | 1 Pips |

Here is a sample of the currencies table HTML as returned by Beautiful Soup:

<thead>
<tr>
</tr>
</thead>
<tbody>
<tr class="tr-decade-0" data-decade="0" data-search_val="eurusd eur/usd">
<td class="id-1"><a class="e-info" href="/markets/eurusd">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/eur-usd/150x150.png")'></div>
<div class="details">
<div class="symbol">EURUSD</div>
<div class="name">EUR/USD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">1</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="usdjpy usd/jpy">
<td class="id-5"><a class="e-info" href="/markets/usdjpy">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-jpy/150x150.png")'></div>
<div class="details">
<div class="symbol">USDJPY</div>
<div class="name">USD/JPY</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">1</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="gbpusd gbp/usd">
<td class="id-2"><a class="e-info" href="/markets/gbpusd">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/gbp-usd/150x150.png")'></div>
<div class="details">
<div class="symbol">GBPUSD</div>
<div class="name">GBP/USD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">2</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="usdchf usd/chf">
<td class="id-6"><a class="e-info" href="/markets/usdchf">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-chf/150x150.png")'></div>
<div class="details">
<div class="symbol">USDCHF</div>
<div class="name">USD/CHF</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">1.5</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="nzdusd nzd/usd">
<td class="id-3"><a class="e-info" href="/markets/nzdusd">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/nzd-usd/150x150.png")'></div>
<div class="details">
<div class="symbol">NZDUSD</div>
<div class="name">NZD/USD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">2.5</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="eurgbp eur/gbp">
<td class="id-8"><a class="e-info" href="/markets/eurgbp">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/eur-gbp/150x150.png")'></div>
<div class="details">
<div class="symbol">EURGBP</div>
<div class="name">EUR/GBP</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">1.5</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="eurjpy eur/jpy">
<td class="id-10"><a class="e-info" href="/markets/eurjpy">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/eur-jpy/150x150.png")'></div>
<div class="details">
<div class="symbol">EURJPY</div>
<div class="name">EUR/JPY</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">2</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="gbpjpy gbp/jpy">
<td class="id-11"><a class="e-info" href="/markets/gbpjpy">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/gbp-jpy/150x150.png")'></div>
<div class="details">
<div class="symbol">GBPJPY</div>
<div class="name">GBP/JPY</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">3</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="audjpy aud/jpy">
<td class="id-14"><a class="e-info" href="/markets/audjpy">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/aud-jpy/150x150.png")'></div>
<div class="details">
<div class="symbol">AUDJPY</div>
<div class="name">AUD/JPY</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">2</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="audusd aud/usd">
<td class="id-7"><a class="e-info" href="/markets/audusd">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/aud-usd/150x150.png")'></div>
<div class="details">
<div class="symbol">AUDUSD</div>
<div class="name">AUD/USD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">1</span> Pips</td></tr>
<tr class="tr-decade-1" data-decade="1" data-search_val="eurchf eur/chf">
<td class="id-9"><a class="e-info" href="/markets/eurchf">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/eur-chf/150x150.png")'></div>
<div class="details">
<div class="symbol">EURCHF</div>
<div class="name">EUR/CHF</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">5</span> Pips</td></tr>
<tr class="tr-decade-1" data-decade="1" data-search_val="euraud eur/aud">
<td class="id-12"><a class="e-info" href="/markets/euraud">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/eur-aud/150x150.png")'></div>
<div class="details">
<div class="symbol">EURAUD</div>
<div class="name">EUR/AUD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">7</span> Pips</td></tr>
<tr class="tr-decade-1" data-decade="1" data-search_val="eurcad eur/cad">
<td class="id-13"><a class="e-info" href="/markets/eurcad">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/eur-cad/150x150.png")'></div>
<div class="details">
<div class="symbol">EURCAD</div>
<div class="name">EUR/CAD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">7</span> Pips</td></tr>
<tr class="tr-decade-1" data-decade="1" data-search_val="cadjpy cad/jpy">
<td class="id-15"><a class="e-info" href="/markets/cadjpy">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/cad-jpy/150x150.png")'></div>
<div class="details">
<div class="symbol">CADJPY</div>
<div class="name">CAD/JPY</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">6</span> Pips</td></tr>
<tr class="tr-decade-1" data-decade="1" data-search_val="chfjpy chf/jpy">
<td class="id-16"><a class="e-info" href="/markets/chfjpy">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/chf-jpy/150x150.png")'></div>
<div class="details">
<div class="symbol">CHFJPY</div>
<div class="name">CHF/JPY</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">6</span> Pips</td></tr>
<tr class="tr-decade-1" data-decade="1" data-search_val="usdhkd usd/hkd">
<td class="id-39"><a class="e-info" href="/markets/usdhkd">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-hkd/150x150.png")'></div>
<div class="details">
<div class="symbol">USDHKD</div>
<div class="name">USD/HKD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">5</span> Pips</td></tr>
<tr class="tr-decade-1" data-decade="1" data-search_val="usdzar usd/zar">
<td class="id-42"><a class="e-info" href="/markets/usdzar">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-zar/150x150.png")'></div>
<div class="details">
<div class="symbol">USDZAR</div>
<div class="name">USD/ZAR</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">50</span> Pips</td></tr>
<tr class="tr-decade-1" data-decade="1" data-search_val="usdrub usd/rub">
<td class="id-44"><a class="e-info" href="/markets/usdrub">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-rub/150x150.png")'></div>
<div class="details">
<div class="symbol">USDRUB</div>
<div class="name">USD/RUB</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">3</span> Pips</td></tr>
<tr class="tr-decade-1" data-decade="1" data-search_val="usdcnh usd/cnh">
<td class="id-45"><a class="e-info" href="/markets/usdcnh">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-cnh/150x150.png")'></div>
<div class="details">
<div class="symbol">USDCNH</div>
<div class="name">USD/CNH</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">10</span> Pips</td></tr>
<tr class="tr-decade-1" data-decade="1" data-search_val="audchf aud/chf">
<td class="id-46"><a class="e-info" href="/markets/audchf">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/aud-chf/150x150.png")'></div>
<div class="details">
<div class="symbol">AUDCHF</div>
<div class="name">AUD/CHF</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-2" data-decade="2" data-search_val="audcad aud/cad">
<td class="id-47"><a class="e-info" href="/markets/audcad">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/aud-cad/150x150.png")'></div>
<div class="details">
<div class="symbol">AUDCAD</div>
<div class="name">AUD/CAD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-2" data-decade="2" data-search_val="audnzd aud/nzd">
<td class="id-48"><a class="e-info" href="/markets/audnzd">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/aud-nzd/150x150.png")'></div>
<div class="details">
<div class="symbol">AUDNZD</div>
<div class="name">AUD/NZD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-2" data-decade="2" data-search_val="eurnzd eur/nzd">
<td class="id-49"><a class="e-info" href="/markets/eurnzd">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/eur-nzd/150x150.png")'></div>
<div class="details">
<div class="symbol">EURNZD</div>
<div class="name">EUR/NZD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-2" data-decade="2" data-search_val="gbpaud gbp/aud">
<td class="id-50"><a class="e-info" href="/markets/gbpaud">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/gbp-aud/150x150.png")'></div>
<div class="details">
<div class="symbol">GBPAUD</div>
<div class="name">GBP/AUD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-2" data-decade="2" data-search_val="gbpchf gbp/chf">
<td class="id-51"><a class="e-info" href="/markets/gbpchf">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/gbp-chf/150x150.png")'></div>
<div class="details">
<div class="symbol">GBPCHF</div>
<div class="name">GBP/CHF</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-2" data-decade="2" data-search_val="gbpnzd gbp/nzd">
<td class="id-52"><a class="e-info" href="/markets/gbpnzd">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/gbp-nzd/150x150.png")'></div>
<div class="details">
<div class="symbol">GBPNZD</div>
<div class="name">GBP/NZD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-2" data-decade="2" data-search_val="nzdcad nzd/cad">
<td class="id-53"><a class="e-info" href="/markets/nzdcad">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/nzd-cad/150x150.png")'></div>
<div class="details">
<div class="symbol">NZDCAD</div>
<div class="name">NZD/CAD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-2" data-decade="2" data-search_val="nzdchf nzd/chf">
<td class="id-54"><a class="e-info" href="/markets/nzdchf">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/nzd-chf/150x150.png")'></div>
<div class="details">
<div class="symbol">NZDCHF</div>
<div class="name">NZD/CHF</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-2" data-decade="2" data-search_val="nzdjpy nzd/jpy">
<td class="id-55"><a class="e-info" href="/markets/nzdjpy">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/nzd-jpy/150x150.png")'></div>
<div class="details">
<div class="symbol">NZDJPY</div>
<div class="name">NZD/JPY</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-2" data-decade="2" data-search_val="cadchf cad/chf">
<td class="id-56"><a class="e-info" href="/markets/cadchf">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/cad-chf/150x150.png")'></div>
<div class="details">
<div class="symbol">CADCHF</div>
<div class="name">CAD/CHF</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-3" data-decade="3" data-search_val="usdnok usd/nok">
<td class="id-57"><a class="e-info" href="/markets/usdnok">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-nok/150x150.png")'></div>
<div class="details">
<div class="symbol">USDNOK</div>
<div class="name">USD/NOK</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">20</span> Pips</td></tr>
<tr class="tr-decade-3" data-decade="3" data-search_val="usdsek usd/sek">
<td class="id-58"><a class="e-info" href="/markets/usdsek">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-sek/150x150.png")'></div>
<div class="details">
<div class="symbol">USDSEK</div>
<div class="name">USD/SEK</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">20</span> Pips</td></tr>
<tr class="tr-decade-3" data-decade="3" data-search_val="noksek nok/sek">
<td class="id-59"><a class="e-info" href="/markets/noksek">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/nok-sek/150x150.png")'></div>
<div class="details">
<div class="symbol">NOKSEK</div>
<div class="name">NOK/SEK</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">20</span> Pips</td></tr>
<tr class="tr-decade-3" data-decade="3" data-search_val="eurnok eur/nok">
<td class="id-60"><a class="e-info" href="/markets/eurnok">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/eur-nok/150x150.png")'></div>
<div class="details">
<div class="symbol">EURNOK</div>
<div class="name">EUR/NOK</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">20</span> Pips</td></tr>
<tr class="tr-decade-3" data-decade="3" data-search_val="eursek eur/sek">
<td class="id-61"><a class="e-info" href="/markets/eursek">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/eur-sek/150x150.png")'></div>
<div class="details">
<div class="symbol">EURSEK</div>
<div class="name">EUR/SEK</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">30</span> Pips</td></tr>
<tr class="tr-decade-3" data-decade="3" data-search_val="usdtry usd/try">
<td class="id-62"><a class="e-info" href="/markets/usdtry">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-try/150x150.png")'></div>
<div class="details">
<div class="symbol">USDTRY</div>
<div class="name">USD/TRY</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">50</span> Pips</td></tr>
<tr class="tr-decade-3" data-decade="3" data-search_val="usdmxn usd/mxn">
<td class="id-63"><a class="e-info" href="/markets/usdmxn">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-mxn/150x150.png")'></div>
<div class="details">
<div class="symbol">USDMXN</div>
<div class="name">USD/MXN</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">20</span> Pips</td></tr>
<tr class="tr-decade-3" data-decade="3" data-search_val="usdsgd usd/sgd">
<td class="id-64"><a class="e-info" href="/markets/usdsgd">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-sgd/150x150.png")'></div>
<div class="details">
<div class="symbol">USDSGD</div>
<div class="name">USD/SGD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">3</span> Pips</td></tr>
<tr class="tr-decade-3" data-decade="3" data-search_val="gbpcad gbp/cad">
<td class="id-65"><a class="e-info" href="/markets/gbpcad">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/gbp-cad/150x150.png")'></div>
<div class="details">
<div class="symbol">GBPCAD</div>
<div class="name">GBP/CAD</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">4</span> Pips</td></tr>
<tr class="tr-decade-3" data-decade="3" data-search_val="zarjpy zar/jpy">
<td class="id-66"><a class="e-info" href="/markets/zarjpy">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/zar-jpy/150x150.png")'></div>
<div class="details">
<div class="symbol">ZARJPY</div>
<div class="name">ZAR/JPY</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">8</span> Pips</td></tr>
<tr class="tr-decade-4" data-decade="4" data-search_val="eurpln eur/pln">
<td class="id-68"><a class="e-info" href="/markets/eurpln">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/eur-pln/150x150.png")'></div>
<div class="details">
<div class="symbol">EURPLN</div>
<div class="name">EUR/PLN</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">30</span> Pips</td></tr>
<tr class="tr-decade-4" data-decade="4" data-search_val="usdhuf usd/huf">
<td class="id-69"><a class="e-info" href="/markets/usdhuf">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-huf/150x150.png")'></div>
<div class="details">
<div class="symbol">USDHUF</div>
<div class="name">USD/HUF</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">20</span> Pips</td></tr>
<tr class="tr-decade-4" data-decade="4" data-search_val="eurhuf eur/huf">
<td class="id-70"><a class="e-info" href="/markets/eurhuf">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/eur-huf/150x150.png")'></div>
<div class="details">
<div class="symbol">EURHUF</div>
<div class="name">EUR/HUF</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">20</span> Pips</td></tr>
<tr class="tr-decade-4" data-decade="4" data-search_val="gbphuf gbp/huf">
<td class="id-71"><a class="e-info" href="/markets/gbphuf">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/gbp-huf/150x150.png")'></div>
<div class="details">
<div class="symbol">GBPHUF</div>
<div class="name">GBP/HUF</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">30</span> Pips</td></tr>
<tr class="tr-decade-4" data-decade="4" data-search_val="chfhuf chf/huf">
<td class="id-72"><a class="e-info" href="/markets/chfhuf">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/chf-huf/150x150.png")'></div>
<div class="details">
<div class="symbol">CHFHUF</div>
<div class="name">CHF/HUF</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">30</span> Pips</td></tr>
<tr class="tr-decade-4" data-decade="4" data-search_val="usdpln usd/pln">
<td class="id-73"><a class="e-info" href="/markets/usdpln">
<div class="image" style='background-image:url("https://etoro-cdn.etorostatic.com/market-avatars/usd-pln/150x150.png")'></div>
<div class="details">
<div class="symbol">USDPLN</div>
<div class="name">USD/PLN</div>
</div>
</a></td><td data-th="Spread"><span class="spread_num">20</span> Pips</td></tr></tbody>
</table>

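My first thought was to loop over my groups and identify each table with soup.find, then loop over each asset to find the matching tr and extract the relevant number from its td. However, the site structure confuses me, so I am having trouble coding this. Can anyone help?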
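
The row-matching idea described above can be sketched on a trimmed copy of the sample HTML. This is only an illustration under assumptions: the `html` string is a shortened stand-in for the real page source, the `wanted` list is hypothetical, and the selectors (`tr[data-search_val]`, `div.name`, `td[data-th="Spread"]`) are taken from the sample rows rather than verified against the live site.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the currencies table shown in the question.
html = """
<table><tbody>
<tr class="tr-decade-0" data-decade="0" data-search_val="eurusd eur/usd">
<td class="id-1"><a class="e-info" href="/markets/eurusd">
<div class="details"><div class="symbol">EURUSD</div><div class="name">EUR/USD</div></div>
</a></td><td data-th="Spread"><span class="spread_num">1</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="usdjpy usd/jpy">
<td class="id-5"><a class="e-info" href="/markets/usdjpy">
<div class="details"><div class="symbol">USDJPY</div><div class="name">USD/JPY</div></div>
</a></td><td data-th="Spread"><span class="spread_num">1</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="gbpusd gbp/usd">
<td class="id-2"><a class="e-info" href="/markets/gbpusd">
<div class="details"><div class="symbol">GBPUSD</div><div class="name">GBP/USD</div></div>
</a></td><td data-th="Spread"><span class="spread_num">2</span> Pips</td></tr>
<tr class="tr-decade-0" data-decade="0" data-search_val="usdchf usd/chf">
<td class="id-6"><a class="e-info" href="/markets/usdchf">
<div class="details"><div class="symbol">USDCHF</div><div class="name">USD/CHF</div></div>
</a></td><td data-th="Spread"><span class="spread_num">1.5</span> Pips</td></tr>
</tbody></table>
"""

# Hypothetical list of instruments to keep (mirrors assets['Currencies']).
wanted = ["eur/usd", "gbp/usd", "usd/jpy"]

soup = BeautifulSoup(html, "html.parser")

fees = {}
for row in soup.select("tr[data-search_val]"):
    # The instrument name lives in <div class="name">, e.g. "EUR/USD".
    name = row.select_one("div.name").get_text(strip=True).lower()
    if name in wanted:
        # The fee cell holds the number in a span followed by the unit.
        cell = row.select_one('td[data-th="Spread"]')
        fees[name] = cell.get_text(" ", strip=True)

for name, fee in fees.items():
    print(f"{name}: {fee}")
```

Matching on the lowercased `div.name` text keeps the comparison independent of how the page capitalizes instrument names; rows that are not in `wanted` (here USD/CHF) are simply skipped.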

Make your life easier and just dump the HTML into pandas. All the tables are there, and you can access them with a for loop or by indexing.

Here's how:

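from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from tabulate import tabulate

options = Options()
options.headless = False
driver = webdriver.Chrome(options=options)

driver.get("https://www.etoro.com/trading/fees/#cfds")
# read_html returns a list of DataFrames, one per <table> in the page source.
# Wrapping the source in StringIO avoids the literal-HTML deprecation warning in newer pandas.
tables = pd.read_html(StringIO(driver.page_source), flavor="bs4")
print(tabulate(tables[-9]))
driver.quit()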

Sample output:

--  --------------------------------------------  ------------  ------------
 0  BTC Bitcoin                                   $ -11.813292  $ 0
 1  ETHEREUM Ethereum                             $ -0.300336   $ 0
 2  BCH Bitcoin Cash                              $ -0.138758   $ 0
 3  XRP Ripple                                    $ -0.00009    $ 0
 4  DASH Dash                                     $ -0.040001   $ 0
 5  LTC Litecoin                                  $ -0.049917   $ 0
 6  ETC Ethereum Classic                          $ -0.0023     $ 0
 7  ADA Cardano                                   $ -0.000072   $ 0
 8  MIOTA IOTA                                    $ -0.000119   $ 0
 9  XLM Stellar                                   $ -0.000051   $ 0
10  EOS EOS                                       $ -0.001041   $ 0
11  NEO NEO                                       $ -0.005731   $ 0
12  TRX TRON                                      $ -0.000011   $ 0
13  ZEC ZCASH                                     $ -0.02535    $ 0
14  BNB Binance Coin                              $ -0.014941   $ 0
15  XTZ Tezos                                     $ -0.000813   $ 0
16  LINK Chainlink                                $ -0.004082   $ 0
17  UNI Uniswap                                   $ -0.000816   $ 0
18  DOGE Dogecoin                                 $ -0.000163   $ -0.000163
19  BTCEUR Bitcoin/Euro                           $ -11.813292  $ 0
20  ETHEUR Ethereum/Euro                          $ -0.300336   $ 0

and so on ...

Note: running in headless mode gives me a CAPTCHA, so headless mode is turned off here.
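
Once a table is in a DataFrame, picking out specific instruments is a filtering step. The sketch below is an assumption-laden illustration: the `spreads` frame is built by hand to stand in for one entry of the list returned by pd.read_html, and it assumes the first column holds "SYMBOL NAME" text (as in the sample output above) with integer column labels because the table header is empty. The `wanted` list is hypothetical.

```python
import pandas as pd

# Hand-built stand-in for one spreads table from the read_html result.
spreads = pd.DataFrame({
    0: ["EURUSD EUR/USD", "USDJPY USD/JPY", "GBPUSD GBP/USD", "USDCHF USD/CHF"],
    1: ["1 Pips", "1 Pips", "2 Pips", "1.5 Pips"],
})

# Hypothetical list of instruments to keep.
wanted = ["eur/usd", "gbp/usd", "usd/jpy"]

# The last whitespace-separated word of the first column is the instrument name.
names = spreads[0].str.split().str[-1].str.lower()
mask = names.isin(wanted)

picked = spreads[mask].copy()
picked.columns = ["Instrument", "Fee"]
picked["Instrument"] = names[mask]

print(picked.to_string(index=False))
```

The same filter can be applied inside a for loop over the list of tables, keeping only the frames where at least one row matches.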

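First, the code below expands all of the spread lists and then keeps clicking the "load more" button for each one until all records are shown. However, neither expanding the lists nor clicking "load more" changes the overall element structure of the page, so you can skip this step and go straight to the next code block, which actually scrapes the data and builds the result from groups and assets: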

from bs4 import BeautifulSoup as soup
from selenium import webdriver

d = webdriver.Chrome('/path/to/chromedriver')
d.get('https://www.etoro.com/trading/fees/#cfds')

# Expand every collapsed list on the page.
d.execute_script("""
   for (var i of document.querySelectorAll('span.expand-toggle')){
      i.click()
}""")

# JavaScript that returns the computed `display` value of each "load more" button.
SHOW_MORE_DISPLAY = """
  return Array.from(document.querySelectorAll('button.show-more')).map(function(x){
      var style = window.getComputedStyle(x);
      return style.getPropertyValue('display');
   });
"""

# Keep clicking "load more" until no such button is displayed as a block.
visible = d.execute_script(SHOW_MORE_DISPLAY)
while 'block' in visible:
    d.execute_script("""
    for (var i of document.querySelectorAll('button.show-more')){
       i.click()
    }""")
    visible = d.execute_script(SHOW_MORE_DISPLAY)

Now, scrape the tables (expanded, if the block above was run):

# Requires Python 3.8+ for the walrus operator (:=).
page = soup(d.page_source, 'html.parser')

# Keep the div.e-row blocks in the second tab whose last class ends in a number from 3 to 8.
blocks = [i for i in page.select('div.page_single_tab.tab_2 div.e-row')
          if (k := i['class'][-1].split('-')[-1]).isdigit() and 3 <= int(k) <= 8]

groups = ['Currencies', 'Commodities', 'Indices']
assets = {'Currencies': ['eur/usd','gbp/usd', 'usd/jpy'], 'Commodities': ['gold', 'silver', 'copper'], 'Indices': ['spx50', 'dj30']}

# Map each block title to {lowercased instrument name: spread number}.
vals = {i.select_one('.expand-title-1').text:
            {j.text.lower(): k.select_one('.spread_num').text
             for k in i.select('div.expand-content tr') if (j := k.select_one('.name'))}
        for i in blocks}

# Look up only the requested assets; names not found in a block map to None.
result = {a: {i: vals[a].get(i) for i in b} for a, b in assets.items()}

Output:

{'Currencies': {'eur/usd': '1', 'gbp/usd': '2', 'usd/jpy': '1'}, 'Commodities': {'gold': '45', 'silver': '5', 'copper': '2'}, 'Indices': {'spx50': None, 'dj30': '600'}}
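
The values in result are strings, and a name that did not match any row comes back as None (spx50 above). A minimal post-processing sketch, assuming the result dictionary has exactly the shape shown in the output; the helper names `to_rows` and `missing` are hypothetical and not part of the answer's code:

```python
# Copy of the output shown above, used as input for this sketch.
result = {
    'Currencies': {'eur/usd': '1', 'gbp/usd': '2', 'usd/jpy': '1'},
    'Commodities': {'gold': '45', 'silver': '5', 'copper': '2'},
    'Indices': {'spx50': None, 'dj30': '600'},
}


def to_rows(group):
    """Return (instrument, spread) pairs with spreads as floats, or None if absent."""
    return [(name, float(value) if value is not None else None)
            for name, value in group.items()]


def missing(result):
    """Return (group, instrument) pairs whose lookup returned None."""
    return [(group_name, name)
            for group_name, group in result.items()
            for name, value in group.items()
            if value is None]


for group_name, group in result.items():
    print(group_name)
    for name, spread in to_rows(group):
        shown = spread if spread is not None else 'not found'
        print(f"  {name:<10} {shown}")

print("Missing:", missing(result))
```

Listing the missing entries makes it easy to spot asset names that need to be spelled the way the page spells them before rerunning the lookup.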
