Python Beautiful Soup extracting data within a div tag itself

I am trying to use Python's Beautiful Soup to pull data from an HTML file. The following HTML is the part I'm interested in:

<div class="myself" title="Name@email.com [11:07:27 AM]">
     <nobr>Name</nobr></div>

I want to extract the title (with the email and time stamp). I am able to access the class with...

find('div', attrs={'class': 'myself'})

From there I can print the entire contents of the div, or the info in tags nested inside it, but I can't figure out how to get the title, because it is an attribute of the div tag itself.

Attributes can be retrieved in a dictionary-like manner:

A tag may have any number of attributes. You can access a tag's attributes by treating the tag like a dictionary.

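from bs4 import BeautifulSoup

# data holds the HTML source as a string
soup = BeautifulSoup(data, "html.parser")
# title=True matches only divs that actually have a title attribute
div = soup.find("div", class_="myself", title=True)
print(div["title"])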
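If the div might be missing, or might not carry a title at all, a slightly more defensive variant can help. This is only a sketch: the helper name `extract_title` is hypothetical and the `html.parser` backend is an assumption. `Tag.get` returns `None` when the attribute is absent, instead of raising `KeyError` the way `div["title"]` would.

```python
from bs4 import BeautifulSoup

HTML = '''<div class="myself" title="Name@email.com [11:07:27 AM]">
     <nobr>Name</nobr></div>'''


def extract_title(html):
    """Return the title of the first div with class "myself", or None."""
    # "html.parser" is the stdlib-backed parser; assumed here for portability
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="myself")
    if div is None:
        # No matching div in the document
        return None
    # .get() yields None when the div has no title attribute
    return div.get("title")


print(extract_title(HTML))
print(extract_title("<div class='other'></div>"))
```

The first call prints the title string; the second prints None because no div with that class exists.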

You may use this method:

>>> import bs4
>>> html_string = '''<div class="myself" title="Name@email.com [11:07:27 AM]">
...  <nobr>Name</nobr></div>'''
>>> title_string = bs4.BeautifulSoup(html_string, "html.parser").div.attrs['title']
>>> print(title_string)
Name@email.com [11:07:27 AM]
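Once the title string is in hand, the email and the time stamp can be separated if needed. The sketch below assumes the title always looks like `address [time]`, as in the question's example; the helper name `split_title` and the regular expression are illustrative, not part of Beautiful Soup.

```python
import re

# Assumed layout: a whitespace-free address, then the time in square brackets
TITLE_PATTERN = re.compile(r"^(?P<email>\S+)\s+\[(?P<time>[^\]]+)\]$")


def split_title(title):
    """Split 'address [time]' into (address, time); None if it does not match."""
    match = TITLE_PATTERN.match(title.strip())
    if match is None:
        return None
    return match.group("email"), match.group("time")


email, stamp = split_title("Name@email.com [11:07:27 AM]")
print(email)
print(stamp)
```

Returning None for an unexpected layout keeps the caller in charge of deciding what to do with titles that do not follow the assumed pattern.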
