
BeautifulSoup - Get text from tag even if it has other tags inside

Let's say that I have the following list:

l = [<p>NC:<strong> 1</strong></p>, <p>APC<strong> 2</strong></p>, <p>GED<strong> 3</strong></p>]

and the type of every element in that list is bs4.element.Tag.

What I want to get is a list that looks like this:

ll = ['NC: 1','APC: 2','GED: 3']

What I tried to do is something like this:

ll = [element.get_text() for element in l]

But it returns:

['NC:\xa01', 'APC:\xa02', 'GED:\xa03']

To me it looks like it has some problem with the space inside <strong></strong>. What is the right way to fix this?
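For context, \xa0 is how Python's repr shows U+00A0, a non-breaking space (what &nbsp; in HTML decodes to), so the space is there but it is not an ordinary ASCII space. A minimal sketch of normalizing it after the fact, assuming the strings are exactly the ones shown above; the names raw and cleaned are made up for illustration:

```python
# Strings as returned by get_text() in the question (copied here by hand).
raw = ['NC:\xa01', 'APC:\xa02', 'GED:\xa03']

# '\xa0' is U+00A0 NO-BREAK SPACE; swap it for a plain ASCII space.
cleaned = [s.replace('\xa0', ' ') for s in raw]

print(cleaned)  # ['NC: 1', 'APC: 2', 'GED: 3']
```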

OK, I found the answer. The way to do it is:

ll = [entrance.get_text(strip=True) for entrance in l]
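One caveat: with the default empty separator, strip=True most likely removes the non-breaking space as well, because each text fragment is stripped and Python treats \xa0 as whitespace. A sketch of a variant that passes " " as the separator to get_text(). The sample markup is an assumption (the question does not show the raw HTML, so the &nbsp; is a guess), and it needs beautifulsoup4 installed:

```python
from bs4 import BeautifulSoup

# Assumed markup: the raw HTML is not shown in the question, &nbsp; is a guess.
html = "<p>NC:<strong>&nbsp;1</strong></p><p>APC:<strong>&nbsp;2</strong></p>"
soup = BeautifulSoup(html, "html.parser")
tags = soup.find_all("p")

# strip=True trims every text fragment; the first argument joins the fragments.
with_sep = [tag.get_text(" ", strip=True) for tag in tags]
no_sep = [tag.get_text(strip=True) for tag in tags]

print(with_sep)
print(no_sep)
```

If the space between the label and the number matters, the separator version is the one to compare against the desired output.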
