BeautifulSoup: extract href from anchor tag

I've been reading other people's methods, but I can't connect the dots here. I need to extract the href "/agente/listing/details/5828063" from the anchor tag that starts on the second line of the snippet below.

The source page snippet is:

<div class="col-md-3" style="margin: 12px auto;">
  <a title="Abrir imóvel numa nova tab" data-bind="attr:{ href: '/agente/listing/details/' + ID }" target="_blank"
    href="/agente/listing/details/5828063">
    <span class="glyphicon glyphicon-new-window"></span>
  </a>

  <div class="discount-container loaded">

    <div data-bind="if: CampaingDescription"></div>

    <span data-bind="click: $parent.ShowListingDetails, attr:{ id: 'bkmimg' + ID }" style="cursor:pointer"
      id="bkmimg5828063">
      <!-- ko if: ListingPictureUrl != '' && ListingPictureUrl != null -->
      <img class="picture" data-bind="attr:{ src:ListingPictureUrl, 'data-original': ListingPictureUrl}"
        onerror="this.src='/agente/images/default-listing.png'"
        src="https://remaxpt-media.azurewebsites.net/images/listings/12204/122041118-203/L_07e3e17f83064ad2a228f234bf57b32a.jpg?w=160&amp;h=160"
        data-original="https://remaxpt-media.azurewebsites.net/images/listings/12204/122041118-203/L_07e3e17f83064ad2a228f234bf57b32a.jpg?w=160&amp;h=160">
      <!-- /ko -->
      <!-- ko if: ListingPictureUrl == '' || ListingPictureUrl == null -->
      <!-- /ko -->
    </span>
    <!-- ko if: MLS -->
    <!-- /ko -->
  </div>
</div>

You can read an attribute of the a tag with get(). Here find('a') returns the first anchor in the document, and get('href') returns the value of its href attribute.

CODE:

from bs4 import BeautifulSoup

html = """<div class="col-md-3" style="margin: 12px auto;">
  <a title="Abrir imóvel numa nova tab" data-bind="attr:{ href: '/agente/listing/details/' + ID }" target="_blank"
    href="/agente/listing/details/5828063">
    <span class="glyphicon glyphicon-new-window"></span>
  </a>

  <div class="discount-container loaded">

    <div data-bind="if: CampaingDescription"></div>

    <span data-bind="click: $parent.ShowListingDetails, attr:{ id: 'bkmimg' + ID }" style="cursor:pointer"
      id="bkmimg5828063">
      <!-- ko if: ListingPictureUrl != '' && ListingPictureUrl != null -->
      <img class="picture" data-bind="attr:{ src:ListingPictureUrl, 'data-original': ListingPictureUrl}"
       onerror="this.src='/agente/images/default-listing.png'"
        src="https://remaxpt-media.azurewebsites.net/images/listings/12204/122041118-203/L_07e3e17f83064ad2a228f234bf57b32a.jpg?w=160&amp;h=160"
        data-original="https://remaxpt-media.azurewebsites.net/images/listings/12204/122041118-203/L_07e3e17f83064ad2a228f234bf57b32a.jpg?w=160&amp;h=160">
      <!-- /ko -->
      <!-- ko if: ListingPictureUrl == '' || ListingPictureUrl == null -->
      <!-- /ko -->
    </span>
    <!-- ko if: MLS -->
    <!-- /ko -->
  </div>
</div>"""

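soup = BeautifulSoup(html, "html.parser")
# find() returns the first <a> tag in the document
alink = soup.find('a')
# get() returns the attribute value, or None if the attribute is absent
print(alink.get('href'))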

RESULT:

/agente/listing/details/5828063
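
If the page may contain several anchors, or anchors without an href, a slightly more defensive variant can help. The sketch below uses a shortened copy of the snippet above plus one extra anchor added purely for illustration; the helper name extract_hrefs is hypothetical. It relies on find_all("a", href=True), which only matches anchors that actually carry an href attribute.

```python
from bs4 import BeautifulSoup

# Shortened copy of the snippet above; the second <a> is an added
# illustration of an anchor that has no href attribute.
html = """<div class="col-md-3">
  <a title="Abrir imóvel numa nova tab" target="_blank"
    href="/agente/listing/details/5828063">
    <span class="glyphicon glyphicon-new-window"></span>
  </a>
  <a name="no-href-here">placeholder</a>
</div>"""


def extract_hrefs(markup):
    """Return the href of every <a> tag that has one (hypothetical helper)."""
    soup = BeautifulSoup(markup, "html.parser")
    # href=True skips anchors that lack an href attribute
    return [a["href"] for a in soup.find_all("a", href=True)]


print(extract_hrefs(html))
```

With this input the list should contain only the listing path, and markup without any links yields an empty list rather than an AttributeError on None.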

You can also select the anchor by its title attribute and then read its href:

from bs4 import BeautifulSoup

html = """<div class="col-md-3" style="margin: 12px auto;">
  <a title="Abrir imóvel numa nova tab" data-bind="attr:{ href: '/agente/listing/details/' + ID }" target="_blank"
    href="/agente/listing/details/5828063">
    <span class="glyphicon glyphicon-new-window"></span>
  </a>

  <div class="discount-container loaded">

    <div data-bind="if: CampaingDescription"></div>

    <span data-bind="click: $parent.ShowListingDetails, attr:{ id: 'bkmimg' + ID }" style="cursor:pointer"
      id="bkmimg5828063">
      <!-- ko if: ListingPictureUrl != '' && ListingPictureUrl != null -->
      <img class="picture" data-bind="attr:{ src:ListingPictureUrl, 'data-original': ListingPictureUrl}"
        onerror="this.src='/agente/images/default-listing.png'"
        src="https://remaxpt-media.azurewebsites.net/images/listings/12204/122041118-203/L_07e3e17f83064ad2a228f234bf57b32a.jpg?w=160&amp;h=160"
        data-original="https://remaxpt-media.azurewebsites.net/images/listings/12204/122041118-203/L_07e3e17f83064ad2a228f234bf57b32a.jpg?w=160&amp;h=160">
      <!-- /ko -->
      <!-- ko if: ListingPictureUrl == '' || ListingPictureUrl == null -->
      <!-- /ko -->
    </span>
    <!-- ko if: MLS -->
    <!-- /ko -->
  </div>
</div>"""

soup = BeautifulSoup(html, 'html.parser')

# Match the anchor by its title attribute, then read its href
target = soup.find("a", {'title': 'Abrir imóvel numa nova tab'}).get("href")
print(target)

Output:

/agente/listing/details/5828063
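
The same lookup can also be written as a CSS selector with select_one(), and the relative href can then be turned into an absolute URL or split to get the listing ID. This is only a sketch: BASE_URL is a placeholder assumption (the real site's address is not given in the question), and the HTML is a shortened copy of the snippet.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://example.com/"  # assumption: placeholder, not the real site

html = """<div class="col-md-3">
  <a title="Abrir imóvel numa nova tab" target="_blank"
    href="/agente/listing/details/5828063">
    <span class="glyphicon glyphicon-new-window"></span>
  </a>
</div>"""

soup = BeautifulSoup(html, "html.parser")

# Select the first anchor whose href starts with the listing path
link = soup.select_one('a[href^="/agente/listing/details/"]')
href = link.get("href") if link is not None else None

absolute_url = None
listing_id = None
if href:
    # Resolve the relative path against the (assumed) base URL
    absolute_url = urljoin(BASE_URL, href)
    # The listing ID is the last path segment
    listing_id = href.rstrip("/").rsplit("/", 1)[-1]

print(href)
print(absolute_url)
print(listing_id)
```

Checking link for None before calling get() avoids an AttributeError when the selector matches nothing.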
