How do I get the content inside window.data with Beautiful Soup and convert it to JSON so I can choose which keys and values to print?
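I didn't know how to word the title, so it's a bit long. Feel free to edit it.
I'm trying to scrape data from this site, but I can't figure out how to access the individual keys and values inside "window.data" with Beautiful Soup.
For example, I want to get the values of yyuid, birthday, and so on.
The code looks like this: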
import urllib.request
import urllib.error
from bs4 import BeautifulSoup
import re

username = "itsahardday"
url = "https://likee.video/@" + username  # profile url - https://likee.video/account_name

def get_profile_html():
    '''
    Get profile data from HTML - https://likee.video/account_name
    :return:
    '''
    response = urllib.request.urlopen(url)
    soup = BeautifulSoup(response.read(), "html.parser")
    results = soup.select_one("script:-soup-contains('userinfo')").string
    print(results)

get_profile_html()
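Ideally I'd like it as JSON, but any solution is welcome.
Thanks in advance for your help!
I adjusted your code so that the function returns the script text instead of only printing it.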
import urllib.request
import urllib.error
from bs4 import BeautifulSoup
import re

username = "itsahardday"
url = "https://likee.video/@" + username  # profile url - https://likee.video/account_name

def get_profile_html():
    '''
    Get profile data from HTML - https://likee.video/account_name
    :return:
    '''
    response = urllib.request.urlopen(url)
    soup = BeautifulSoup(response.read(), "html.parser")
    results = soup.select_one("script:-soup-contains('userinfo')").string
    print(results)
    return results  # add return

res = get_profile_html()  # save the result
Then convert it to JSON:
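import json  # import
# Keep the text before the first ";", drop the "window.data =" prefix, then parse the rest as JSON
userinfo = json.loads(res.split(";")[0].split("window.data =")[1])['userinfo']
print(userinfo)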
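One caveat with splitting on ";": if any string value inside the JSON happens to contain a semicolon, the payload gets cut short and json.loads fails. Below is a minimal sketch of an alternative that lets the JSON parser find the end of the object itself. The helper name extract_window_data and the sample string are made up for illustration; the real script text on the site may be laid out differently, so treat this as an assumption to check against the actual page.

```python
import json

def extract_window_data(script_text, marker="window.data"):
    """Return the object assigned to `marker` inside a script's text.

    Hypothetical helper: assumes the script contains `window.data = {...}`.
    """
    start = script_text.index(marker)
    # Jump to the first "{" after the marker, where the JSON object begins.
    brace = script_text.index("{", start)
    # raw_decode parses exactly one JSON value starting at `brace` and
    # ignores whatever follows, so semicolons inside strings are harmless.
    data, _end = json.JSONDecoder().raw_decode(script_text, brace)
    return data

# Made-up sample standing in for the text of the <script> tag.
sample = (
    'window.data = {"userinfo": {"yyuid": "123", '
    '"birthday": "2000-01-01", "bio": "hi; there"}};window.other = 1;'
)

userinfo = extract_window_data(sample)["userinfo"]
print(userinfo["yyuid"], userinfo["birthday"])  # prints: 123 2000-01-01
```

With the function from the answer, the call would be extract_window_data(res)["userinfo"], after which individual keys such as yyuid or birthday can be read like any dict entry.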