How do I extract this particular element from this webpage using axios, reactjs, and cheerio?

I'm trying to extract live commodity prices from the CME Group by web scraping with axios and cheerio. I'm having trouble finding the correct selector path for cheerio to get each element in the table I'm scraping. Right now I'm just trying to get the month (for example, JLY 20) from the span tag in each row.

Link to the actual webpage: https://www.cmegroup.com/trading/metals/base/copper_quotes_settlements_futures.html

Here's what I have right now:

Server.js

  // Requires axios and cheerio to be imported at the top of the file.
  componentDidMount() {
    axios.get('https://www.cmegroup.com/trading/metals/base/copper_quotes_settlements_futures.html')
      .then(response => {
        if (response.status === 200) {
          const html = response.data;
          const $ = cheerio.load(html);
          let data = [];
          // Iterates over each matching table (not over each row).
          $('table.cmeTable').each((i, elem) => {
            console.log($(elem).find('span.noWrap').text());
            data.push({
              Month: $(elem).find('th.cmeFixedColumn').text()
              // title: $(elem).find('h2.entry-title').text(),
              // excerpt: $(elem).find('p.hide_xxs').text().trim(),
              // link: $(elem).find('h2.entry-title a').attr('href')
            });
          });
          console.log(data);
          // fs.writeFile('devtoList.json',
          // JSON.stringify(devtoListTrimmed, null, 4),
          // (err)=> console.log('File successfully written!'))
        }
      }, (error) => console.log('err'));
  }

Here's the source code of the target table:

<div class="cmeTableBlockWrapper cmeContentSection cmeContentGroup" style=""><div class="cmeTableResponsiveScrollableWrapper">
<table id="settlementsFuturesProductTable" class="cmeTable" border="0" cellpadding="2" cellspacing="0" summary="Settlements Table">
    <thead>
        <tr>
            <th scope="col" class="invisibleElement cmeFixedColumn" style="height: 33px; width: 120px; min-width: 120px;">Month</th>
            <th scope="col">Open</th>
            <th scope="col">High</th>
            <th scope="col">Low</th>
            <th scope="col">Last</th>
            <th scope="col">Change</th>
            <th scope="col">Settle</th>
            <th scope="col">Estimated Volume</th>
            <th scope="col">Prior Day Open Interest</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">JLY 20</span></th>
            <td>2.8990</td>
            <td>2.9210</td>
            <td>2.8945</td>
            <td>2.9155</td>
            <td><span>-.0260</span></td>
            <td>2.9160</td>
            <td class="cmeTableRight">818</td>
            <td class="cmeTableRight">3,140</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">AUG 20</span></th>
            <td>2.9105</td>
            <td>2.9330</td>
            <td>2.8980</td>
            <td>2.9270</td>
            <td><span>-.0245</span></td>
            <td>2.9250</td>
            <td class="cmeTableRight">191</td>
            <td class="cmeTableRight">2,994</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">SEP 20</span></th>
            <td>2.9160</td>
            <td>2.9460</td>
            <td>2.8980</td>
            <td>2.9300</td>
            <td><span>-.0225</span></td>
            <td>2.9325</td>
            <td class="cmeTableRight">80,068</td>
            <td class="cmeTableRight">115,684</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">OCT 20</span></th>
            <td>2.9350</td>
            <td>2.9400</td>
            <td>2.9280</td>
            <td>2.9400</td>
            <td><span>-.0220</span></td>
            <td>2.9405</td>
            <td class="cmeTableRight">10</td>
            <td class="cmeTableRight">2,012</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">NOV 20</span></th>
            <td>2.9375</td>
            <td>2.9380</td>
            <td>2.9330</td>
            <td>2.9330</td>
            <td><span>-.0215</span></td>
            <td>2.9470</td>
            <td class="cmeTableRight">10</td>
            <td class="cmeTableRight">2,123</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">DEC 20</span></th>
            <td>2.9340</td>
            <td>2.9630</td>
            <td>2.9150</td>
            <td>2.9480B</td>
            <td><span>-.0205</span></td>
            <td>2.9505</td>
            <td class="cmeTableRight">12,155</td>
            <td class="cmeTableRight">52,370</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">JAN 21</span></th>
            <td>-</td>
            <td>-</td>
            <td>2.9465A</td>
            <td>2.9465A</td>
            <td><span>-.0195</span></td>
            <td>2.9560</td>
            <td class="cmeTableRight">4</td>
            <td class="cmeTableRight">592</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">FEB 21</span></th>
            <td>-</td>
            <td>-</td>
            <td>2.9525A</td>
            <td>2.9525A</td>
            <td><span>-.0195</span></td>
            <td>2.9590</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">361</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">MAR 21</span></th>
            <td>2.9535</td>
            <td>2.9720</td>
            <td>2.9300</td>
            <td>2.9590</td>
            <td><span>-.0185</span></td>
            <td>2.9615</td>
            <td class="cmeTableRight">8,055</td>
            <td class="cmeTableRight">31,345</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">APR 21</span></th>
            <td>-</td>
            <td>-</td>
            <td>2.9575A</td>
            <td>2.9575A</td>
            <td><span>-.0175</span></td>
            <td>2.9650</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">181</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">MAY 21</span></th>
            <td>2.9665</td>
            <td>2.9720</td>
            <td>2.9480</td>
            <td>2.9655B</td>
            <td><span>-.0165</span></td>
            <td>2.9655</td>
            <td class="cmeTableRight">1,619</td>
            <td class="cmeTableRight">6,208</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">JUN 21</span></th>
            <td>-</td>
            <td>-</td>
            <td>2.9610A</td>
            <td>2.9610A</td>
            <td><span>-.0155</span></td>
            <td>2.9685</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">160</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">JLY 21</span></th>
            <td>2.9585</td>
            <td>2.9755B</td>
            <td>2.9540</td>
            <td>2.9670B</td>
            <td><span>-.0155</span></td>
            <td>2.9690</td>
            <td class="cmeTableRight">471</td>
            <td class="cmeTableRight">934</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">AUG 21</span></th>
            <td>-</td>
            <td>-</td>
            <td>2.9640A</td>
            <td>2.9640A</td>
            <td><span>-.0160</span></td>
            <td>2.9715</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">114</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">SEP 21</span></th>
            <td>-</td>
            <td>-</td>
            <td>2.9635A</td>
            <td>2.9635A</td>
            <td><span>-.0155</span></td>
            <td>2.9720</td>
            <td class="cmeTableRight">4</td>
            <td class="cmeTableRight">437</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">OCT 21</span></th>
            <td>-</td>
            <td>-</td>
            <td>2.9685A</td>
            <td>2.9685A</td>
            <td><span>-.0160</span></td>
            <td>2.9755</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">79</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">NOV 21</span></th>
            <td>-</td>
            <td>-</td>
            <td>2.9720A</td>
            <td>2.9720A</td>
            <td><span>-.0160</span></td>
            <td>2.9760</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">33</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">DEC 21</span></th>
            <td>2.9795</td>
            <td>2.9795</td>
            <td>2.9520A</td>
            <td>2.9680</td>
            <td><span>-.0155</span></td>
            <td>2.9765</td>
            <td class="cmeTableRight">65</td>
            <td class="cmeTableRight">1,065</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">JAN 22</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0155</span></td>
            <td>2.9795</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">4</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">FEB 22</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>2.9820</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">MAR 22</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0135</span></td>
            <td>2.9830</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">136</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">APR 22</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0155</span></td>
            <td>2.9910</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">MAY 22</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>2.9905</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">5</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">JUN 22</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>2.9930</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">JLY 22</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>2.9935</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">20</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">SEP 22</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>2.9995</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">DEC 22</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0030</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">25</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">MAR 23</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0070</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">MAY 23</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0095</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">JLY 23</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0125</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">SEP 23</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0150</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">DEC 23</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0440</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">MAR 24</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0445</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">MAY 24</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0450</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">JLY 24</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0455</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">SEP 24</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0460</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">DEC 24</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0465</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">MAR 25</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0470</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">MAY 25</span></th>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td>-</td>
            <td><span>-.0145</span></td>
            <td>3.0475</td>
            <td class="cmeTableRight">0</td>
            <td class="cmeTableRight">0</td>
        </tr>
        <tr>
            <th scope="row" class="invisibleElement cmeFixedColumn" style="height: 41px; width: 120px;"><span class="noWrap">Total</span></th>
            <td></td>
            <td></td>
            <td></td>
            <td></td>
            <td><span></span></td>
            <td></td>
            <td class="cmeTableRight">103,470</td>
            <td class="cmeTableRight">220,022</td>
        </tr>
    </tbody>
</table>

Appreciate any help. Thanks.

From reviewing the site you linked: the reason you cannot select the content is that the data table is loaded asynchronously. axios only receives the initial HTML, and cheerio does not execute the page's JavaScript, so your script runs against markup in which the table has not been rendered yet.

If you open the devtools for the site you linked, you can see there is an asynchronous call to this endpoint.

A better strategy would be to collect the data from the URL I linked to above.
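As a rough sketch of that strategy, the code below requests a JSON endpoint and maps the response to rows. Everything specific here is an assumption: the endpoint URL is not reproduced (pass in whatever the devtools Network tab shows), and the payload shape (`settlements`, `month`, `settle`) and the function names `toRows` and `fetchRows` are hypothetical. Inspect the real response and adjust the field names accordingly.

```javascript
// Hypothetical payload shape: { settlements: [{ month, settle, ... }] }.
// The real field names must be checked against the actual response.
function toRows(payload) {
  const items = payload && Array.isArray(payload.settlements)
    ? payload.settlements
    : [];
  return items.map((item) => ({ Month: item.month, Settle: item.settle }));
}

// getJson is injected so the HTTP client (axios, fetch, ...) stays swappable.
// It must return a promise that resolves to the parsed JSON body.
async function fetchRows(endpointUrl, getJson) {
  const payload = await getJson(endpointUrl);
  return toRows(payload);
}
```

With axios this could be called as `fetchRows(endpointUrl, (url) => axios.get(url).then((res) => res.data))`, where `endpointUrl` is the address observed in devtools.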

Edit: on further examination of the source code, you can get the data you need to construct the async URL from window.cmeComponents.

