Web scraping with Scrapy and BeautifulSoup

<h2 class="result-item-name" data-nid="117" data-localisation="25.88872, -80.12488">
  Bal Harbour    </h2>

Hi everyone, I'm trying to collect the 'data-nid' and 'data-localisation' attributes, but when I write my code:

'geocoordinates':['class','result-item-name','data-localisation']

I always get a None response.

Can you help me? I am new to BeautifulSoup and not yet comfortable with it.

Thank you very much!

In BeautifulSoup you can access a tag's attributes by key, just like a dictionary.

Example:

from bs4 import BeautifulSoup
html = """<h2 class="result-item-name" data-nid="117" data-localisation="25.88872, -80.12488">Bal Harbour    </h2>"""

soup = BeautifulSoup(html, "html.parser")
h2 = soup.find("h2", class_="result-item-name")
print(h2["data-nid"])
print(h2["data-localisation"])

Output:

117
25.88872, -80.12488

from bs4 import BeautifulSoup

html = """
<h2 class="result-item-name" data-nid="117" data-localisation="25.88872, -80.12488">
Bal Harbour
</h2>
"""
soup = BeautifulSoup(html, 'html.parser')

h2 = soup.find('h2', {'class': 'result-item-name'})
print(h2['data-nid'])
print(h2['data-localisation'])

Output:

117
25.88872, -80.12488
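
If the page has several such headings, or some of them lack the data attributes, indexing with h2["..."] raises a KeyError. A minimal sketch of a more defensive variant, assuming you want one record per heading: it uses Tag.get(), which returns None for a missing attribute, and splits the coordinate string into floats. The second h2 in the sample HTML and the keys of the result dictionaries are made up for illustration; they are not from the original page.

```python
from bs4 import BeautifulSoup

# The second <h2> is a hypothetical element added to show the missing-attribute case.
html = """
<h2 class="result-item-name" data-nid="117" data-localisation="25.88872, -80.12488">
  Bal Harbour    </h2>
<h2 class="result-item-name">No attributes here</h2>
"""

soup = BeautifulSoup(html, "html.parser")

results = []
for h2 in soup.find_all("h2", class_="result-item-name"):
    # .get() returns None instead of raising KeyError when the attribute is absent
    nid = h2.get("data-nid")
    loc = h2.get("data-localisation")

    coords = None
    if loc:
        # "25.88872, -80.12488" -> (25.88872, -80.12488)
        lat, lon = (float(part) for part in loc.split(","))
        coords = (lat, lon)

    results.append({
        "name": h2.get_text(strip=True),  # strips the surrounding whitespace
        "nid": nid,
        "geocoordinates": coords,
    })

print(results)
```

The attribute values come back as strings, so data-nid stays "117" unless you convert it yourself.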
