
Python: Convert BeautifulSoup Output to a Set?

In Python I have:

def tag_visible(element):
    return True


def get_visible_text(soup):
    text_tags = soup.find_all(text=True)
    visible_texts = filter(tag_visible, text_tags)
    stripped = set()
    for text in visible_texts:
        stripped.add(text.strip())
    return stripped

I have 2 questions:

  1. How can I convert visible_texts into a set in one line?

  2. Is there a data structure in Python that, like set, holds no duplicates but also preserves the order of elements?


UPDATE:

I can do:

return set(visible_texts)

But how do I apply the strip function?

dicts preserve insertion order (guaranteed by the language since Python 3.7). dicts contain key-value pairs. In this case, you don't care about the value, so it's always set to True.

I'm not sure what you are trying to achieve by using filter with a function that always returns True. Please clarify.

def get_visible_text(soup):
    text_tags = soup.find_all(text=True)
    return dict((text.strip(), True) for text in text_tags)
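If a plain ordered sequence is wanted rather than a dict of placeholder values, the same idea can be sketched with dict.fromkeys. This is only an illustration on plain strings, not on BeautifulSoup output; the helper name unique_stripped and the sample data are made up for the example.

```python
def unique_stripped(texts):
    """Strip each string and drop duplicates, keeping first-seen order."""
    # dict.fromkeys builds a dict whose keys are the stripped strings;
    # keys are unique and (since Python 3.7) keep insertion order.
    return list(dict.fromkeys(text.strip() for text in texts))


# Hypothetical sample standing in for the text nodes returned by find_all.
sample = ["  Home ", "About", "Home", " Contact", "About  "]
print(unique_stripped(sample))  # ['Home', 'About', 'Contact']
```

The values stored in the dict are all None here, which plays the same role as the True placeholder above.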

You can apply the strip function by using a set comprehension:

return {text.strip() for text in visible_texts}

Note, however, that a set does not necessarily preserve insertion order.
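A comprehension can also take over the job of filter by adding an if clause. The sketch below works on plain strings and assumes a predicate is_nonblank, which is a hypothetical stand-in for tag_visible and not part of the original code. The result is sorted before printing only because a set has no reliable order.

```python
def is_nonblank(text):
    """Hypothetical predicate standing in for tag_visible."""
    return bool(text.strip())


def stripped_set(texts, keep=is_nonblank):
    # The `if` clause does the job of filter(); strip() is applied to the rest.
    return {text.strip() for text in texts if keep(text)}


# Hypothetical sample standing in for the text nodes returned by find_all.
sample = ["  Home ", "\n", "About", "Home", "   "]
print(sorted(stripped_set(sample)))  # ['About', 'Home']
```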

