How to remove characters from a string inside a list in Python

Question

I've been playing around with my code for quite some time now. I want to replace a string of text in the values returned by the each_div variable, which holds a whole bunch of parsed values from a webpage.

def scrape_page():
    create_dir(project_dir)
    page = 1
    max_page = 10
    while page < max_page:
        page = page + 1
        for each_div in soup.find_all('div',{'class':'username'}):
            f.write(str(each_div) + "\n")

If I run this code, it parses the data in the username class from an HTML page. The problem is that it returns it like this:

<div class="username">someone_s_username</div>

What I've been trying to do is strip the <div class="username"> and </div> parts away so that it returns only the actual username instead of the HTML. If anyone has an idea on how to accomplish this, that would be terrific. Thank you.

Answer 1

Sure, you can use Python's str.replace method. Note that find_all returns Tag objects rather than strings, so convert each one with str() first:

for each_div in soup.find_all('div', {'class': 'username'}):
    text = str(each_div)  # Tag -> its HTML markup as a string
    text = text.replace('<div class="username">', "")
    text = text.replace("</div>", "")
    f.write(text + "\n")

Alternatively, you can split the string to obtain the part you want:

for each_div in soup.find_all('div', {'class': 'username'}):
    text = str(each_div)       # Tag -> its HTML markup as a string
    text = text.split(">")[1]  # the piece between the first and second ">"
    text = text.split("<")[0]  # everything before the following "<"
    f.write(text + "\n")
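
As a quick check of the two string-based approaches, here is a minimal sketch that runs them on a hard-coded copy of the markup from the question. The helper names strip_with_replace and strip_with_split are made up for illustration, and both assume the markup has exactly this shape (no nested tags or extra attributes).

```python
# Sample markup copied from the question; in the real scraper
# this string would come from str(each_div).
markup = '<div class="username">someone_s_username</div>'


def strip_with_replace(html):
    """Remove the known opening and closing tags by literal replacement."""
    return html.replace('<div class="username">', "").replace("</div>", "")


def strip_with_split(html):
    """Keep only the text between the first '>' and the following '<'."""
    return html.split(">")[1].split("<")[0]


print(strip_with_replace(markup))
print(strip_with_split(markup))
```

Both helpers break as soon as the tag carries other attributes or contains child tags, which is why asking the parser for the text is usually the safer route.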

Oh, I just remembered, I believe you can simply do this:

for each_div in soup.find_all('div',{'class':'username'}):
    f.write(str(each_div.text) + "\n")
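
To try that last approach end to end, here is a small self-contained sketch that parses an in-memory HTML snippet instead of a live page. The sample markup and the second username are invented for illustration, and it assumes the beautifulsoup4 package is installed. It calls get_text(), which returns the same string as the .text attribute.

```python
from bs4 import BeautifulSoup

# Made-up sample page standing in for the scraped HTML.
html = """
<div class="username">someone_s_username</div>
<div class="username">another_username</div>
"""

soup = BeautifulSoup(html, "html.parser")

# get_text() returns only the text inside each tag, without the markup.
usernames = [div.get_text() for div in soup.find_all("div", {"class": "username"})]

for name in usernames:
    print(name)
```

Because the parser already knows where each tag starts and ends, this keeps working even if the div gains extra attributes.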
