
Python Attribute Error: 'NoneType' object has no attribute 'find_all'

I'm trying to get the abbreviations of US states, but this code:

from bs4 import BeautifulSoup
from urllib.request import urlopen
url='https://simple.wikipedia.org/wiki/List_of_U.S._states'
web=urlopen(url)
source=BeautifulSoup(web, 'html.parser')
table=source.find('table', {'class': 'wikitable sortable jquery-tablesorter'})
abbs=table.find_all('b')
print(abbs.get_text())

returns AttributeError: 'NoneType' object has no attribute 'find_all'. What's the problem with my code?

Here you go.

I changed the class in source.find to 'wikitable sortable'. Also, the method abbs.get_text() gave me an error, so I just used a list comprehension to get the text you wanted.

from bs4 import BeautifulSoup
from urllib.request import urlopen

web = urlopen('https://simple.wikipedia.org/wiki/List_of_U.S._states')
source = BeautifulSoup(web, 'lxml')
table = source.find(class_='wikitable sortable').find_all('b')
b_arr = '\n'.join([x.text for x in table])
print(b_arr)

Partial Output:

AL
AK
AZ
AR
CA
CO

As Patrick suggested, source.find() returns only the first matching element, or None when nothing matches.

Source code of the find() method, for reference:

def find(self, name=None, attrs={}, recursive=True, text=None, **kwargs):
    """Return only the first child of this Tag matching the given criteria."""
    r = None
    l = self.find_all(name, attrs, recursive, text, 1, **kwargs)
    if l:
        r = l[0]
    return r
findChild = find

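In the HTML that urlopen receives, the table's class attribute is wikitable sortable. The jquery-tablesorter class visible in the browser's inspector is presumably added by JavaScript after the page loads, so it is not part of the downloaded source.
So, as the code above shows, find() matched nothing and returned None, and calling find_all on None raised the AttributeError.

You may want to change your code to: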
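To see the matching behaviour in isolation, here is a minimal sketch. It assumes a small made-up HTML snippet (SAMPLE_HTML, not the real Wikipedia page) so it runs without network access. It illustrates that a multi-word class string is compared against the whole class attribute, while a single class name matches any tag carrying that class.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the server-sent HTML (not the real page).
SAMPLE_HTML = """
<table class="wikitable sortable">
  <tr><td><b>AL</b></td><td>Alabama</td></tr>
  <tr><td><b>AK</b></td><td>Alaska</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The extra "jquery-tablesorter" is not in the markup, so nothing matches
# and find() returns None.
missing = soup.find("table", {"class": "wikitable sortable jquery-tablesorter"})

# The exact attribute string matches.
exact = soup.find("table", {"class": "wikitable sortable"})

# A single class name matches regardless of the other classes on the tag.
single = soup.find("table", class_="wikitable")

print(missing is None, exact is not None, single is not None)
```

Searching by one class name is the more forgiving choice here, since it keeps working if the page adds or reorders classes.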

from bs4 import BeautifulSoup
from urllib.request import urlopen

url = 'https://simple.wikipedia.org/wiki/List_of_U.S._states'
web = urlopen(url)
source = BeautifulSoup(web, 'html.parser')

table = source.find('table', class_='wikitable')
abbs = table.find_all('b')

abbs_list = [i.get_text().strip() for i in abbs]
print(abbs_list)
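If the page layout changes again, the same AttributeError would come back. One possible way to make that failure easier to diagnose is to check the result of find() before using it. The sketch below is only an illustration: extract_abbreviations is a hypothetical helper name, and the sample markup is made up so the example runs offline.

```python
from bs4 import BeautifulSoup


def extract_abbreviations(html):
    """Return the stripped text of each <b> tag in the first 'wikitable' table.

    Hypothetical helper: raises ValueError when no such table exists, so the
    error names the real problem instead of surfacing later as an
    AttributeError on None.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="wikitable")
    if table is None:
        raise ValueError("no <table> with class 'wikitable' found")
    return [b.get_text().strip() for b in table.find_all("b")]


# Made-up sample markup, standing in for the downloaded page.
sample = (
    '<table class="wikitable sortable">'
    "<tr><td><b>AL</b></td><td><b>AK</b></td></tr>"
    "</table>"
)
print(extract_abbreviations(sample))
```

With the real page you would pass the bytes read from urlopen(url) instead of sample.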

I hope this answers your question. :)

As suggested in the comments, the HTML at the URL doesn't have a table with the class

'wikitable sortable jquery-tablesorter'

The class is actually

'wikitable sortable'

Also, find_all returns a list of all matching tags, so you can't call get_text() on the result directly. You can use a list comprehension to extract the text of each element in the list. Here's code that will work for your problem:

from bs4 import BeautifulSoup
from urllib.request import urlopen
url='https://simple.wikipedia.org/wiki/List_of_U.S._states'
web=urlopen(url)
source=BeautifulSoup(web, 'html.parser')
table=source.find('table', {'class': 'wikitable sortable'})
abbs=table.find_all('b')
values = [ele.text.strip() for ele in abbs]
print(values)
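The difference between a single tag and the collection returned by find_all can be checked with a short sketch. It assumes made-up sample markup rather than the live page, and only illustrates the point about get_text().

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup so the example needs no network access.
SAMPLE_HTML = (
    "<table class='wikitable sortable'>"
    "<tr><td><b> AL </b></td><td><b>AK</b></td></tr>"
    "</table>"
)

table = BeautifulSoup(SAMPLE_HTML, "html.parser").find("table", class_="wikitable")
abbs = table.find_all("b")  # list-like collection of Tag objects

try:
    abbs.get_text()  # the collection itself has no get_text()
    collection_call_failed = False
except AttributeError:
    collection_call_failed = True

# Calling get_text() on each Tag works; strip=True trims the whitespace.
values = [tag.get_text(strip=True) for tag in abbs]
print(collection_call_failed, values)
```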

