Python: obtaining "id" or "data-value" from html?

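I am trying to extract the "id" or "data-value" values from this block of HTML and put them in a list. It seems I am not targeting the right elements. Where am I going wrong? Eventually, I would also like to pick out the individual product IDs in the is_in_stock section.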

My code -

import requests
from bs4 import BeautifulSoup as bs

response = session.get(product_url)
soup = bs(response.text,'lxml')
div = soup.find("div",{"class":"item"})
all_sizes = div.find_all("data")

HTML-

                                                     <div class="product-options" id="product-options-wrapper">
<script type="text/javascript">
                    try {
                        var changeConfigurableStatus = true;
                        var stStatus = new StockStatus({"242":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92964"},"246":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92965"},"363":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92966"},"248":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92967"},"243":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92968"},"368":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92969"},"244":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92970"},"247":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92971"},"79":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92972"},"249":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92973"}});
                    }
                        catch(ex){}
                </script>
        <div class="configurable-product-option no-display">
        <div class="configurable-product-option-wrapper">
            <h2>Please select your size</h2>
            <div class="drop-select">
                <label for="attribute139"></label>
                <select name="super_attribute[139]"
                        id="attribute139"
                        class="required-entry super-attribute-select">
                    <option>Choose an Option...</option>
                </select>
            </div>
        </div>
    </div>
    <script type="text/javascript">
    var spConfig = new Product.Config({"attributes":{"139":{"id":"139","code":"eu_size","label":"EU ","options":[{"id":"242","label":"EU 40 2\/3 \/ US 7.5","price":"0","oldPrice":"0","products":["92964"]},{"id":"246","label":"EU 41 1\/3 \/ US 8","price":"0","oldPrice":"0","products":["92965"]},{"id":"363","label":"EU 42 \/ US 8.5","price":"0","oldPrice":"0","products":["92966"]},{"id":"248","label":"EU 42 2\/3 \/ US 9","price":"0","oldPrice":"0","products":["92967"]},{"id":"243","label":"EU 43 1\/3 \/ US 9.5","price":"0","oldPrice":"0","products":["92968"]},{"id":"368","label":"EU 44 \/ US 10","price":"0","oldPrice":"0","products":["92969"]},{"id":"244","label":"EU 44 2\/3 US 10.5","price":"0","oldPrice":"0","products":["92970"]},{"id":"247","label":"EU 45 1\/3 \/ US 11","price":"0","oldPrice":"0","products":["92971"]},{"id":"79","label":"EU 46 \/ US 11.5","price":"0","oldPrice":"0","products":["92972"]},{"id":"249","label":"EU 46 2\/3 \/ US 12","price":"0","oldPrice":"0","products":["92973"]}]}},"template":"\u20ac#{price}","basePrice":"89","oldPrice":"89","productId":"90522","chooseText":"Choose an Option...","taxConfig":{"includeTax":true,"showIncludeTax":true,"showBothPrices":false,"defaultTax":19,"currentTax":19,"inclTaxTitle":"Incl. Tax"}});
</script>

<h3>Choose size</h3>
<div class="clearfix " data-attribute="attribute139" >
                <div class="attribute-item "
        data-value="242">
        EU 40 2/3 / US 7.5        </div>
                <div class="attribute-item "
        data-value="246">
        EU 41 1/3 / US 8        </div>
                <div class="attribute-item "
        data-value="363">
        EU 42 / US 8.5        </div>
                <div class="attribute-item "
        data-value="248">
        EU 42 2/3 / US 9        </div>
                <div class="attribute-item "
        data-value="243">
        EU 43 1/3 / US 9.5        </div>
                <div class="attribute-item "
        data-value="368">
        EU 44 / US 10        </div>
                <div class="attribute-item "
        data-value="244">
        EU 44 2/3 US 10.5        </div>
                <div class="attribute-item "
        data-value="247">
        EU 45 1/3 / US 11        </div>
                <div class="attribute-item "
        data-value="79">
        EU 46 / US 11.5        </div>
                <div class="attribute-item "
        data-value="249">
        EU 46 2/3 / US 12        </div>
    </div>

This should work for you.

import requests
from bs4 import BeautifulSoup as bs

session = requests.Session()  # assumption: the question does not show how `session` is created
response = session.get(product_url)  # product_url is the product page URL, as in the question
soup = bs(response.text, 'lxml')

# Select every div with the .attribute-item class
divs = soup.find_all("div", {"class": "attribute-item"})
# Extract the 'data-value' attribute from each of those divs
all_sizes = [div['data-value'] for div in divs]
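
As a self-contained check of the same idea, here is a minimal sketch that runs against a trimmed copy of the HTML from the question instead of a live request. The variable names `html`, `items` and `sizes` are illustrative only, and `html.parser` is used just to avoid depending on lxml. It uses a CSS selector via `select` and also keeps the visible size label next to each `data-value`.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the size list from the question (two entries only).
html = '''
<div class="clearfix " data-attribute="attribute139" >
    <div class="attribute-item "
        data-value="242">
        EU 40 2/3 / US 7.5        </div>
    <div class="attribute-item "
        data-value="246">
        EU 41 1/3 / US 8        </div>
</div>
'''

soup = BeautifulSoup(html, "html.parser")

# CSS selector: every div with class attribute-item that carries a data-value.
items = soup.select("div.attribute-item[data-value]")

# Map each data-value to its visible size label (surrounding whitespace stripped).
sizes = {item["data-value"]: item.get_text(strip=True) for item in items}
print(sizes)
```

If only the values are needed, `list(sizes)` gives the same list as `all_sizes` in the answer.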

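You are on the right track, but you need find_all rather than find, targeting the divs with the attribute-item class and reading their data-value attribute: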

ids = []
for div in soup.find_all("div", {"class": "attribute-item"}):
    ids.append(div['data-value'])  # read the attribute from the loop variable
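
The question also asks about the product IDs in the is_in_stock section. Those live inside the inline script as a JSON object passed to `StockStatus(...)`, not in tag attributes, so `find_all` on attributes will not reach them. One possible approach, sketched here under the assumption that the script keeps the shape shown in the question, is to cut the JSON out with a regular expression and parse it with the standard `json` module. The names `script_text`, `stock` and `in_stock_ids` are hypothetical.

```python
import json
import re

# Shortened copy of the inline script from the question (two sizes only).
script_text = '''
var changeConfigurableStatus = true;
var stStatus = new StockStatus({"242":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92964"},"246":{"is_in_stock":true,"custom_status_icon":"","custom_status":"","product_id":"92965"}});
'''

# Capture the JSON object literal passed to StockStatus(...).
match = re.search(r"new StockStatus\((\{.*?\})\);", script_text, re.DOTALL)
stock = json.loads(match.group(1)) if match else {}

# Map each option id (the data-value) to its product_id, in-stock entries only.
in_stock_ids = {
    option_id: info["product_id"]
    for option_id, info in stock.items()
    if info["is_in_stock"]
}
print(in_stock_ids)
```

On the real page, the script text could probably be obtained with something like `soup.find("script", string=re.compile("StockStatus")).string`. That lookup is an assumption about how the page is served and is not tested here. The same regex-plus-json pattern should also apply to the `Product.Config(...)` block if the option "id" and "label" values are wanted.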

