Compare a Python list with strings extracted from HTML using BeautifulSoup (Python 3)
I am working on a Python project using BeautifulSoup, in which I am trying to extract data from HTML. I have already extracted some data, but now I am facing an issue. I have this list:

category = ['author', 'pinglun', 'renqi', 'step', 'gongyi', 'nandu', 'renshu', 'kouwei', 'zbshijian', 'prshijian']

In my HTML document there is a value associated with each element of this list. I have tried the code below, but it only retrieves the value of one element, i.e. "author". I want to extract the values of all the elements in the "category" list.
for script in scripts:
    if "_BFD.BFD_INFO" in script.text:
        text = script.text
        # Narrow the script text down to the first value after the second '='
        m_text = text.split('=')
        m_text = m_text[2].split(":")
        m_text = m_text[1].split(',')
        encoded = m_text[0].encode('utf-8')
        print(encoded.decode('utf-8'))

category = ['author', 'pinglun', 'renqi', 'step', 'gongyi', 'nandu',
            'renshu', 'kouwei', 'zbshijian', 'prshijian']
for script in scripts:
    text = script.text
    m_text = text.split(',')
    for n in m_text:
        # Only segments containing 'author' are printed
        if 'author' in n:
            print(n)
Just use a for-loop to check each text segment against every category:
for script in scripts:
    if "_BFD.BFD_INFO" in script.text:
        text = script.text
        m_text = text.split('=')
        m_text = m_text[2].split(":")
        m_text = m_text[1].split(',')
        encoded = m_text[0].encode('utf-8')
        print(encoded.decode('utf-8'))

category = ['author', 'pinglun', 'renqi', 'step', 'gongyi', 'nandu',
            'renshu', 'kouwei', 'zbshijian', 'prshijian']
for script in scripts:
    text = script.text
    m_text = text.split(',')
    for n in m_text:
        # Print the segment if it mentions any of the categories
        for c in category:
            if c in n:
                print(n)
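
If you want the values themselves rather than the raw text segments, one option is to collect them into a dictionary keyed by category. The following is only a minimal sketch: it assumes the script text stores each entry as a quoted key followed by a colon and a quoted value, which the question does not confirm. The sample string and the helper name extract_categories are hypothetical and only serve to illustrate the idea.

```python
import re

# Hypothetical sample of the script text; the real _BFD.BFD_INFO layout may differ.
sample = ('_BFD.BFD_INFO = {"author" : "Alice", "pinglun" : "12", '
          '"renqi" : "3456", "title" : "x"};')

category = ['author', 'pinglun', 'renqi', 'step', 'gongyi', 'nandu',
            'renshu', 'kouwei', 'zbshijian', 'prshijian']


def extract_categories(text, keys):
    """Return {key: value} for every key that appears as key : "value" in text."""
    found = {}
    for key in keys:
        # Whole-word key, optional closing quote, a colon, then a quoted value.
        pattern = r'\b' + re.escape(key) + r'\b["\']?\s*:\s*["\']([^"\']*)["\']'
        match = re.search(pattern, text)
        if match:
            found[key] = match.group(1)
    return found


result = extract_categories(sample, category)
print(result)
```

With real data you would call extract_categories(script.text, category) inside the loop over scripts. Categories that do not occur in the text are simply left out of the result, so the pattern may need adjusting if the actual values are unquoted or nested.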