Python Attribute Error: 'NoneType' object has no attribute 'find_all'

I'm trying to get abbreviations of US states but this code:

from bs4 import BeautifulSoup
from urllib.request import urlopen
url='https://simple.wikipedia.org/wiki/List_of_U.S._states'
web=urlopen(url)
source=BeautifulSoup(web, 'html.parser')
table=source.find('table', {'class': 'wikitable sortable jquery-tablesorter'})
abbs=table.find_all('b')
print(abbs.get_text())

returns AttributeError: 'NoneType' object has no attribute 'find_all'. What's wrong with my code?

Here you go.

I changed the class in source.find to 'wikitable sortable'. Also, abbs.get_text() raised an error, so I used a list comprehension to get the text you wanted.

from bs4 import BeautifulSoup
from urllib.request import urlopen

web = urlopen('https://simple.wikipedia.org/wiki/List_of_U.S._states')
source = BeautifulSoup(web, 'lxml')
table = source.find(class_='wikitable sortable').find_all('b')
b_arr = '\n'.join([x.text for x in table])
print(b_arr)

Partial Output:

AL
AK
AZ
AR
CA
CO

As Patrick suggested, source.find() returns only the first matching element, or None when nothing matches.

Source code of the find() method, for reference:

def find(self, name=None, attrs={}, recursive=True, text=None, **kwargs):
    """Return only the first child of this Tag matching the given criteria."""
    r = None
    l = self.find_all(name, attrs, recursive, text, 1, **kwargs)
    if l:
        r = l[0]
    return r
findChild = find

When I inspected the table in the fetched HTML, its class was 'wikitable sortable'.
So, per the code above, find() matched nothing for your class string and returned None.
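To see the behaviour in isolation, here is a minimal sketch that runs offline. The inline HTML is a made-up stand-in for the page the server returns, not the real Wikipedia markup. My assumption is that the extra jquery-tablesorter class is added by JavaScript in the browser, which would explain why it shows up in the browser's inspector but not in what urlopen downloads.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the static HTML returned by the server.
html = """
<table class="wikitable sortable">
  <tr><td><b>AL</b></td><td>Alabama</td></tr>
  <tr><td><b>AK</b></td><td>Alaska</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# This class string does not occur in the markup, so nothing matches.
wrong = "wikitable sortable jquery-tablesorter"
matches = soup.find_all("table", {"class": wrong}, limit=1)
missing = soup.find("table", {"class": wrong})

# The exact class string from the markup matches.
exact = soup.find("table", {"class": "wikitable sortable"})
# A single class name matches any tag that carries that class.
single = soup.find("table", class_="wikitable")

print(matches)            # empty result, so find() has nothing to return
print(missing is None)    # True: calling missing.find_all(...) would raise AttributeError
print(exact is not None, single is not None)
```

Matching on the single class 'wikitable' is the more forgiving choice, since it keeps working if other classes are added to or removed from the table.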

So you may want to change your code as follows:

from bs4 import BeautifulSoup
from urllib.request import urlopen

url = 'https://simple.wikipedia.org/wiki/List_of_U.S._states'
web = urlopen(url)
source = BeautifulSoup(web, 'html.parser')

table = source.find('table', class_='wikitable')
abbs = table.find_all('b')

abbs_list = [i.get_text().strip() for i in abbs]
print(abbs_list)

I hope it'll answer your question. :)

As suggested in the comments, the HTML at the URL doesn't have a table with the class

'wikitable sortable jquery-tablesorter'

The class is actually

'wikitable sortable'

Also, find_all returns a list of all matching tags, so you can't call get_text() on it directly. You can use a list comprehension to extract the text of each element in the list. Here's code that will work for your problem:

from bs4 import BeautifulSoup
from urllib.request import urlopen
url='https://simple.wikipedia.org/wiki/List_of_U.S._states'
web=urlopen(url)
source=BeautifulSoup(web, 'html.parser')
table=source.find('table', {'class': 'wikitable sortable'})
abbs=table.find_all('b')
values = [ele.text.strip() for ele in abbs]
print(values)
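Both points can be combined into a small defensive helper. This is only a sketch: bold_texts is a hypothetical name I made up, and the inline HTML is invented for illustration so the example runs without network access. It checks the result of find() before using it, and reads the text from each tag rather than from the list that find_all() returns.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only.
html = (
    "<table class='wikitable sortable'>"
    "<tr><td><b> AL </b></td><td><b>AK</b></td></tr>"
    "</table>"
)
soup = BeautifulSoup(html, "html.parser")


def bold_texts(soup, class_name):
    """Return the stripped text of every <b> in the first table with class_name."""
    table = soup.find("table", class_=class_name)
    if table is None:
        # find() matched nothing; fail with a clear message here
        # instead of an AttributeError on the next line.
        raise ValueError(f"no <table> with class {class_name!r} found")
    # find_all() returns a list-like result, so get the text tag by tag.
    return [b.get_text(strip=True) for b in table.find_all("b")]


# The list returned by find_all() has no get_text() of its own.
tags = soup.find_all("b")
try:
    tags.get_text()
    list_has_get_text = True
except AttributeError:
    list_has_get_text = False

values = bold_texts(soup, "wikitable")
print(list_has_get_text)  # False
print(values)             # ['AL', 'AK']
```

Raising an explicit error when the table is missing makes a changed page layout easier to diagnose than the original 'NoneType' message.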
