Python and BeautifulSoup: why is my if condition never entered?

Question

I'm trying to scrape a web page using BeautifulSoup, and I wrote some code that extracts information from a table. Here is the code I'm working on, but I have a problem with the if condition:

p = soup_tab.find_all('tr')
j = 0
for i in p:
    soup_tr = BeautifulSoup(str(i), 'html.parser')
    if soup_tr.find('span', {
            "id": "ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl"
                  + str(j) + "_reference"}):
        print("enter if 1 =======================")
        cons_intitule_ref = soup_tr.find('span', {
            "id": "ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl"
                  + str(j) + "_reference"}).get_text()
        resultat.append(cons_intitule_ref)

The problem is in the if condition: when I run the program, "enter if 1 ========" is never printed. I'm sure the tag I'm searching for is correct, so I think the problem lies in the condition itself.

Any help would be appreciated; I've been stuck on this problem for hours. Thank you in advance.

Answer 1

The primary problem I see: your counter j starts at 0, when it should start at 1 to print the result you want. In the html shown below, the ids begin at ctl1, so a search for the ctl0 id finds nothing and the condition is never true.
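A minimal sketch of that fix, assuming every table row holds exactly one span and that the id counter starts at 1. The two-row table here is hypothetical, modelled on the ids from the question, and `html.parser` is used only to avoid an extra dependency:

```python
from bs4 import BeautifulSoup

# Hypothetical two-row table modelled on the ids from the question.
html = '''<table>
<tr><td><span id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl1_reference">01/AMI/RDOE/2017</span></td></tr>
<tr><td><span id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl2_reference">01/ct/2017</span></td></tr>
</table>'''

ID_PREFIX = "ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl"

soup_tab = BeautifulSoup(html, "html.parser")
resultat = []

# enumerate(..., start=1) keeps the counter in step with the rows:
# row 1 is checked against ...ctl1_reference, row 2 against ...ctl2_reference.
for j, row in enumerate(soup_tab.find_all("tr"), start=1):
    # A Tag can be searched directly; there is no need to re-parse str(row).
    span = row.find("span", {"id": f"{ID_PREFIX}{j}_reference"})
    if span is not None:
        resultat.append(span.get_text())

print(resultat)
```

If the real table contains header rows or other rows without such a span, a row-based counter drifts out of step with the ids, which is one reason to prefer matching the ids by pattern.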

Suppose this is your html (a condensed version of the actual page), and you're trying to get the text associated with each of these tags:

html = '''<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl1_reference">01/AMI/RDOE/2017</span>
<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl2_reference">01/ct/2017</span>
<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl3_reference">108/2017/CNSS</span>
<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl4_reference">1/2017</span>
<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl5_reference">09/2017/CZC</span>
<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl6_reference">65/2017/TGR</span>
<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl7_reference">20/2017/DMSPK</span>
<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl8_reference">05/INDH/2017</span>
<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl9_reference">13/CS/2017</span>
<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl10_reference">158/2017/RRA</span>'''

You should use a regex, like so:

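import re
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'lxml')
# Raw string so that \d reaches the regex engine unchanged.
for item in soup.find_all('span', {"id": re.compile(
        r"ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl\d+_reference")}):
    # Echoed in an interactive session; wrap in print() when run as a script.
    item.get_text()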

Returns:

'01/AMI/RDOE/2017'
'01/ct/2017'
'108/2017/CNSS'
'1/2017'
'09/2017/CZC'
'65/2017/TGR'
'20/2017/DMSPK'
'05/INDH/2017'
'13/CS/2017'
'158/2017/RRA'
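To end up with a list like the `resultat` from the question, the same regex filter can feed a list comprehension. This is only a sketch under a few assumptions: it uses `html.parser` rather than `lxml` to avoid the extra dependency, the sample is shortened to three of the spans above, and the `_objet` span is a made-up non-matching element added to show the filter at work:

```python
import re
from bs4 import BeautifulSoup

# Shortened sample; the "_objet" span is hypothetical and should be skipped.
html = '''<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl1_reference">01/AMI/RDOE/2017</span>
<span id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl1_objet">ignored</span>
<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl2_reference">01/ct/2017</span>
<span class="ref" id="ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl10_reference">158/2017/RRA</span>'''

# \d+ matches any row number, so no manual counter is needed.
pattern = re.compile(
    r"^ctl0_CONTENU_PAGE_resultSearch_tableauResultSearch_ctl\d+_reference$")

soup = BeautifulSoup(html, "html.parser")

# Collect the text of every span whose id matches the pattern.
resultat = [span.get_text() for span in soup.find_all("span", id=pattern)]

print(resultat)
```

Because the pattern does not depend on the row position, it keeps working when some rows lack a reference span.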

Answer 1
0 accepted 2017-11-20 16:57:59