Python Beautiful Soup: removing extra text

<div class="friendBlockContent">
                Bartdavy<br>
                <span class="friendSmallText">
        Online
                </span>
            </div>

This is the HTML, and I tried:

for div in soup.find_all("div", class_="friendBlockContent"):
    print(div)

This prints the whole div, including whether he is online. I only want the name. How can I get it?

The div has two text nodes. You can access them with .strings, or use .stripped_strings to get them with the surrounding whitespace removed. Then unpack the two nodes into name and online variables.

In [50]: for div in soup.find_all("div", class_="friendBlockContent"):
    ...:     name, online = div.stripped_strings
    ...:

In [51]: name
Out[51]: 'Bartdavy'

In [52]: online
Out[52]: 'Online'
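
The two-variable unpacking above raises a ValueError if a div does not contain exactly two text nodes (for example, if the status span is missing). A minimal sketch that takes only the first stripped string instead, assuming the name always comes first; the `HTML` constant and the `friend_names` helper are hypothetical names made up for this illustration:

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (indentation simplified).
HTML = """
<div class="friendBlockContent">
    Bartdavy<br>
    <span class="friendSmallText">
        Online
    </span>
</div>
"""


def friend_names(html):
    """Return the first non-empty text node of each friendBlockContent div."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for div in soup.find_all("div", class_="friendBlockContent"):
        # stripped_strings is a generator; next() pulls only its first item
        # and falls back to None when the div has no text at all.
        name = next(div.stripped_strings, None)
        if name is not None:
            names.append(name)
    return names


print(friend_names(HTML))  # ['Bartdavy']
```

This keeps working whether or not the online status is present, at the cost of ignoring the status entirely.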

A good way to achieve this:

for div in soup.find_all("div", class_="friendBlockContent"):
    print(div.contents[0])

You can use the following code if you can be sure that the structure is similar to the one you posted:

for div in soup.find_all("div", class_="friendBlockContent"):
    print(div.contents[0].strip())
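
Indexing `contents[0]` depends on the name being the very first child of the div. If the structure might vary, one possible alternative is to collect only the div's own text nodes and skip nested tags. This is a sketch, not something proposed in the answers above; the `direct_text` helper is a hypothetical name:

```python
from bs4 import BeautifulSoup, NavigableString


def direct_text(tag):
    """Join the tag's own text nodes, ignoring text inside child tags.

    Note: HTML comments are also NavigableString instances in bs4,
    so they would be included if present.
    """
    parts = []
    for child in tag.children:
        if isinstance(child, NavigableString):
            text = child.strip()
            if text:  # skip whitespace-only nodes between tags
                parts.append(text)
    return " ".join(parts)


html = """
<div class="friendBlockContent">
    Bartdavy<br>
    <span class="friendSmallText">
        Online
    </span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
for div in soup.find_all("div", class_="friendBlockContent"):
    print(direct_text(div))  # Bartdavy
```

Because it does not rely on position, this also returns the name when the span happens to come before it.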

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please cite this site's URL or the original source.