
Get a string value with Beautiful Soup using Python

I've written a script that extracts text from a forum website. It works, but I also want to get the value of a string that a user posted. For example, see below:

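import re
import requests
from bs4 import BeautifulSoup

# headers and cookies are defined elsewhere in the script (session details omitted)
s = r'(Username[^"]+)(?:<div>)'
r = requests.get("https://example.com/threads/73956/page2", headers=headers, cookies=cookies)
soup = BeautifulSoup(r.content, "html.parser")
result = re.findall(s, r.text)
print(result[0].replace("<br />", ""))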

I want the value of the Number field, which is 2 in this post:

<div class="wwCommentBody">
     <blockquote class="postcontent restore " style="padding: 10px;">Username: 
     leetibrahim<br>
    Number: 2       
     </blockquote>
</div>

This regex uses a lookbehind to capture everything that follows "Number:" on the same line:

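import re
import requests
from bs4 import BeautifulSoup

n_str = r"(?<=Number:)(.+)"
r = requests.get("https://ffs.gg/threads/73956/page2", headers=headers, cookies=cookies)
soup = BeautifulSoup(r.content, "html.parser")
result = re.findall(n_str, r.text)
# the capture includes the whitespace around the value, so strip it
print(result[0].strip())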
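
Since the question asks about Beautiful Soup, here is a minimal sketch of an alternative that limits the search to the post body instead of running the regex over the whole page. It assumes the post is always inside a blockquote with the class postcontent, as in the HTML shown in the question. The names SAMPLE_HTML and extract_number are made up for this illustration, and the sample markup stands in for r.text so that the snippet runs without a network request.

```python
import re
from bs4 import BeautifulSoup

# Sample markup copied from the question; a real script would pass r.text instead.
SAMPLE_HTML = """
<div class="wwCommentBody">
     <blockquote class="postcontent restore " style="padding: 10px;">Username:
     leetibrahim<br>
    Number: 2
     </blockquote>
</div>
"""


def extract_number(html):
    """Return the value after 'Number:' in the first post body, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumption: the post text lives in <blockquote class="postcontent ...">.
    post = soup.select_one("blockquote.postcontent")
    if post is None:
        return None
    # Join the text nodes with a space so the <br> does not glue lines together.
    text = post.get_text(" ")
    match = re.search(r"Number:\s*(\S+)", text)
    return match.group(1) if match else None


print(extract_number(SAMPLE_HTML))
```

The function returns None rather than raising when the blockquote or the Number field is missing, which may be more convenient than indexing result[0] on an empty list when looping over many posts.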


 