简体   繁体   中英

How to extract values with BeautifulSoup with no class

html code :

<td class="_480u">
    <div class="clearfix">
        <div>
            Female
        </div>
    </div>
</td>

I wanted the value "Female" as an output.

I tried bs.findAll('div',{'class':'clearfix'}) ; bs.findAll('tag',{'class':'_480u'}) But these classes are all over my html code and the output is a big list. I wanted to incorporate {td --> class = ".." and div --> class = ".."} in my search, so that I get the output as Female. How can I do this?

Thanks

Use stripped_strings property:

>>> from bs4 import BeautifulSoup
>>>
>>> html = '''<td class="_480u">
...     <div class="clearfix">
...         <div>
...             Female
...         </div>
...     </div>
... </td>'''
>>> soup = BeautifulSoup(html)
>>> print ' '.join(soup.find('div', {'class': 'clearfix'}).stripped_strings)
Female
>>> print ' '.join(soup.find('td', {'class': '_480u'}).stripped_strings)
Female

or specify class as empty string (or None ) and use string property:

>>> soup.find('div', {'class': ''}).string
u'\n            Female\n        '
>>> soup.find('div', {'class': ''}).string.strip()
u'Female'

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM