
Python - safe & elegant way to set a variable from a function that may return None

I'm looking for a more elegant way to assign a variable when the function I call may return None and there are chained methods or attributes following the call.

In the example below I am using BeautifulSoup to parse an HTML doc. If the element I am looking for is not found, the initial function call returns None. The chained attribute access then breaks the code, because None has no .string attribute and an AttributeError is raised.

That all makes sense, but I'm wondering if there's a cleaner way to write these assignments that won't break on a None value.

# I want to do something like this, but it raises AttributeError if soup.find
# returns None, because None has no .string attribute.
title = soup.find("h1", "article-title").string or "none"


# This works but is both ugly and inefficient
title = "none" if soup.find("h1", "article-title") is None else soup.find("h1", "article-title").string

# So instead I'm using this which feels clunky as well
title = soup.find("h1", "article-title")
title = "none" if title is None else title.string

Is there a better way?

I like Shashank's answer, but this might work for you as well:

class placeholder:
    string = "none"

title = (soup.find("h1", "article-title") or placeholder).string
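
A self-contained sketch of the same idea that runs without Beautiful Soup installed. `FakeTag` and `find_title` are hypothetical stand-ins for a bs4 tag and `soup.find`; they are not part of any library. The trick assumes the found object is truthy, so that `or` only falls through to the placeholder when the lookup returns None.

```python
class FakeTag:
    """Hypothetical stand-in for a bs4 Tag: it only carries a .string attribute."""

    def __init__(self, string):
        self.string = string


class Placeholder:
    """Default object exposing the same attribute the caller reads."""

    string = "none"


def find_title(tags, name):
    """Hypothetical stand-in for soup.find: returns a FakeTag or None."""
    return tags.get(name)


tags = {"h1": FakeTag("Breaking news")}

# A found tag is truthy, so `or` keeps it; None falls through to Placeholder.
found = (find_title(tags, "h1") or Placeholder).string
missing = (find_title(tags, "h2") or Placeholder).string

print(found)    # Breaking news
print(missing)  # none
```

If the object returned by the lookup could ever be falsy (for example, a class that defines `__len__` and is empty), an explicit `is None` check is the safer choice.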

This behavior of Beautiful Soup really annoys me as well. Here's my solution: http://soupy.readthedocs.org/en/latest/

This smooths over lots of edge cases in BeautifulSoup, allowing you to write queries like:

dom.find('h1').find('h2').find('a')['href'].orelse('not found').val()

This returns what you're looking for if it exists, or 'not found' otherwise.

The general strategy in soupy is to wrap the data you care about in thin wrapper classes. A simple example of such a wrapper:

class Scalar(object):
    def __init__(self, val):
        self._val = val
    def __getattr__(self, key):
        return Scalar(getattr(self._val, key, None))
    def __call__(self, *args, **kwargs):
        return Scalar(self._val(*args, **kwargs))
    def __repr__(self):
        # %r keeps the quotes, so the outputs shown below match what the REPL prints
        return 'Scalar(%r)' % (self._val,)


s = Scalar('hi there')
s.upper()  # Scalar('HI THERE')
s.a.b.c.d  # Scalar(None)
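
To show how a wrapper like this can end a chain with a default, here is a minimal sketch that adds `orelse` and `val` methods, named after the calls in the soupy query above. It is an illustration under my own assumptions, not soupy's actual implementation, and `Maybe` is a hypothetical class name.

```python
class Maybe:
    """Hypothetical wrapper: chains attribute access and calls without raising on None."""

    def __init__(self, val):
        self._val = val

    def __getattr__(self, key):
        # Only invoked for names not found normally, so orelse/val still resolve.
        return Maybe(getattr(self._val, key, None))

    def __call__(self, *args, **kwargs):
        # Calling a wrapped None (or any non-callable) yields Maybe(None) instead of raising.
        if not callable(self._val):
            return Maybe(None)
        return Maybe(self._val(*args, **kwargs))

    def orelse(self, default):
        # Substitute the default only when the chain ended in None.
        return self if self._val is not None else Maybe(default)

    def val(self):
        # Unwrap the underlying value.
        return self._val


s = Maybe("hi there")
print(s.upper().val())                        # HI THERE
print(s.a.b.c.d.orelse("not found").val())    # not found
print(s.missing().orelse("not found").val())  # not found
```

Because every method returns another `Maybe`, the chain can be as long as needed and only `val()` at the end leaves the wrapper.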

If you want to be fancy about it, the mathematical property that lets you safely chain things forever is closure (i.e., methods return instances of the same type). Lots of BeautifulSoup methods don't have this property, which is what soupy addresses.

You can use the built-in getattr function to provide a default value in case the desired attribute is not found on a given object:

title = getattr(soup.find("h1", "article-title"), "string", "none")
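
One caveat is worth sketching: the `getattr` default only applies when the attribute is missing, not when it exists but is None. In Beautiful Soup, `.string` is None for a tag with more than one child, so the one-liner can still produce None. The helper below, a hypothetical `attr_or`, covers both cases; `FakeTag` is a stand-in for a bs4 tag so the example runs without the library.

```python
class FakeTag:
    """Hypothetical stand-in for a bs4 Tag."""

    def __init__(self, string):
        self.string = string


def attr_or(obj, name, default):
    """Return obj.<name>, or default if the attribute is missing or is None."""
    value = getattr(obj, name, None)
    return default if value is None else value


print(getattr(None, "string", "none"))              # none
print(getattr(FakeTag(None), "string", "none"))     # None (the attribute exists)
print(attr_or(FakeTag(None), "string", "none"))     # none
print(attr_or(FakeTag("Title"), "string", "none"))  # Title
```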

Alternatively, you can use a try statement:

try:
    title = soup.find("h1", "article-title").string
except AttributeError:
    title = "none"

The first method is more elegant in my opinion.
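
One thing to watch with the try version: if the whole expression sits inside the try body, an AttributeError raised by a bug inside the lookup itself is also converted into the default. The sketch below contrasts that with a narrower variant that guards only the attribute access. `FakeTag`, `buggy_find`, `title_broad`, and `title_narrow` are hypothetical names used for illustration.

```python
class FakeTag:
    """Hypothetical stand-in for a bs4 Tag."""

    def __init__(self, string):
        self.string = string


def buggy_find():
    """Hypothetical lookup with a bug: it reads an attribute that does not exist."""
    return FakeTag("Title").no_such_attribute  # raises AttributeError


def title_broad(find):
    # The lookup and the attribute access share one try body.
    try:
        return find().string
    except AttributeError:
        return "none"


def title_narrow(find):
    # Errors raised inside find() propagate; only the .string access is guarded.
    tag = find()
    try:
        return tag.string
    except AttributeError:
        return "none"


print(title_broad(lambda: None))   # none
print(title_narrow(lambda: None))  # none
print(title_broad(buggy_find))     # none (the bug is hidden)

try:
    title_narrow(buggy_find)
except AttributeError as exc:
    print("bug surfaced:", exc)
```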

 