
Python Code to get text between the table tags

I am new to Python. Can someone help me write code to get the data out of these tags?

Below is the table markup. I also need the data in the tables that are nested inside the first table:

<table width="100%" border="0" cellspacing="0" cellpadding="0">
                  <tr> 
                    <td width="7" rowspan="2">&nbsp;</td>
                    <td width='40%'> <div align="left">

                      </div>
                    <td width="7" rowspan="2">&nbsp;</td>
                  </tr>
                  <tr> 
                    <td colspan="2"> 
    <b><font face='Arial, Helvetica, sans-serif' size='2'>Account #: 8428995632 </font></b><BR><TABLE BORDER='1' width='100%' align='center' cellspacing='0'><TR><td align='left' colspan='2'><font face='Arial, Helvetica, sans-serif' size='2'><b>Billing Date:   </b><BR>07-22-2013</font></TD><td align='left' ><font face='Arial, Helvetica, sans-serif' size='2'><b>Past Due Date:    </b><BR>08-12-2013</font></TD></TR><TR><td align='left'><font face='Arial, Helvetica, sans-serif' size='2'><b>Service From: </b><BR>06-11-2013</font></TD><td align='left'><font face='Arial, Helvetica, sans-serif' size='2'><b>Service To:    </b><BR>07-11-2013</font></TD><td align='left'><font face='Arial, Helvetica, sans-serif' size='2'><b>Days of Service: </b><BR>30</font></TD></TR><TR><td align='left' colspan='2'><font face='Arial, Helvetica, sans-serif' size='2'><b>Current Charges:    </b>$30,488.60</font></TD><td align='left' ><font face='Arial, Helvetica, sans-serif' size='2'><b>Amount Due:   </b>$30,488.60</font></TD></TR></TR></TABLE><p><p><p><p><CENTER><font face='Arial, Helvetica, sans-serif' size='3'><b> Meter readings for this bill:</b></font></CENTER><TABLE BORDER='1' width='100%' align='center' cellspacing='0'><TR bgcolor='#FFF2D7'><td align='center' width='18%'><font face='Arial,Helvetica,  sans-serif' size='2'><b>Meter</b></font></TD><td align='center' width='17%'><font face='Arial, Helvetica, sans-serif' size='2'><b>Service<br>From</b></font></TD><td align='center' width='17%'><font face='Arial, Helvetica, sans-serif' size='2'><b>Service<br>To</b></font></TD><td align='center' width='12%'><font face='Arial, Helvetica, sans-serif' size='2'><b># Days</b></font></TD><td align='center' width='10%'><font face='Arial, Helvetica, sans-serif' size='2'><b>Prior<br>Read</b></font></TD><td align='center' width='10%'><font face='Arial, Helvetica, sans-serif' size='2'><b>Current<br>Read</b></font></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' 
size='2'><b>Consumption</b></font></TD><TR><td align='center' width='8%'><font face='Arial,Helvetica,sans-serif' size='2'>S10406906</FONT></TD><td align='center' width='18%'><font face='Arial, Helvetica, sans-serif' size='2'>06-11-2013</FONT></TD><td align='center' width='12%'><font face='Arial, Helvetica, sans-serif' size='2'>07-11-2013</FONT></TD><td align='center' width='8%'><font face='Arial, Helvetica, sans-serif' size='2'>30</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>134</FONT></TD><td align='center' width='22%'><font face='Arial, Helvetica, sans-serif' size='2'>144</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>10</FONT></TD></TR></FONT><TR><td align='center' width='8%'><font face='Arial,Helvetica,sans-serif' size='2'>08400002</FONT></TD><td align='center' width='18%'><font face='Arial, Helvetica, sans-serif' size='2'>06-11-2013</FONT></TD><td align='center' width='12%'><font face='Arial, Helvetica, sans-serif' size='2'>07-11-2013</FONT></TD><td align='center' width='8%'><font face='Arial, Helvetica, sans-serif' size='2'>30</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>30748</FONT></TD><td align='center' width='22%'><font face='Arial, Helvetica, sans-serif' size='2'>32634</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>1886</FONT></TD></TR></FONT><TR><td align='center' width='8%'><font face='Arial,Helvetica,sans-serif' size='2'>S10406911</FONT></TD><td align='center' width='18%'><font face='Arial, Helvetica, sans-serif' size='2'>06-11-2013</FONT></TD><td align='center' width='12%'><font face='Arial, Helvetica, sans-serif' size='2'>07-11-2013</FONT></TD><td align='center' width='8%'><font face='Arial, Helvetica, sans-serif' size='2'>30</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>2717</FONT></TD><td align='center' 
width='22%'><font face='Arial, Helvetica, sans-serif' size='2'>3046</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>329</FONT></TD></TR></FONT><TR><td align='center' width='8%'><font face='Arial,Helvetica,sans-serif' size='2'>08405704</FONT></TD><td align='center' width='18%'><font face='Arial, Helvetica, sans-serif' size='2'>06-11-2013</FONT></TD><td align='center' width='12%'><font face='Arial, Helvetica, sans-serif' size='2'>07-11-2013</FONT></TD><td align='center' width='8%'><font face='Arial, Helvetica, sans-serif' size='2'>30</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>23755</FONT></TD><td align='center' width='22%'><font face='Arial, Helvetica, sans-serif' size='2'>25100</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>1345</FONT></TD></TR></FONT><TR><td align='center' width='8%'><font face='Arial,Helvetica,sans-serif' size='2'>S10406895</FONT></TD><td align='center' width='18%'><font face='Arial, Helvetica, sans-serif' size='2'>06-11-2013</FONT></TD><td align='center' width='12%'><font face='Arial, Helvetica, sans-serif' size='2'>07-11-2013</FONT></TD><td align='center' width='8%'><font face='Arial, Helvetica, sans-serif' size='2'>30</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>97</FONT></TD><td align='center' width='22%'><font face='Arial, Helvetica, sans-serif' size='2'>101</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>4</FONT></TD></TR></FONT><TR><td align='center' width='8%'><font face='Arial,Helvetica,sans-serif' size='2'>S10406893</FONT></TD><td align='center' width='18%'><font face='Arial, Helvetica, sans-serif' size='2'>06-11-2013</FONT></TD><td align='center' width='12%'><font face='Arial, Helvetica, sans-serif' size='2'>07-11-2013</FONT></TD><td align='center' width='8%'><font face='Arial, Helvetica, sans-serif' 
size='2'>30</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>7915</FONT></TD><td align='center' width='22%'><font face='Arial, Helvetica, sans-serif' size='2'>8406</FONT></TD><td align='center' width='16%'><font face='Arial, Helvetica, sans-serif' size='2'>491</FONT></TD></TR></FONT></TABLE><input type='hidden' name='BillId' value='842892230704'></form>

                    </td>
                  </tr>
                </table>

The code I have used is:

print BS('table')[0].text

But this gets only the first table's content.

Thanks for the help.

I'm not really sure what the OP is asking for here, but this is what I got out of the question:

"The below is the table tag, also I need the data which is present in the table which is inside the first table:"

From this sentence, I suspect you want the one line of text within the outer <table>, along with the text from the first inner <table>. There are many ways to go about this in BeautifulSoup, but this way made the most sense to me.

# "html" is assumed to be a BeautifulSoup object built from your sample HTML.
font_tags = html.find_all('font')

# Print each piece of data wrapped in a <font> tag.
for font_tag in font_tags:
    # This heading begins the second inner table, and we don't want that.
    if font_tag.text == " Meter readings for this bill:":
        break
    print(font_tag.text)

This prints the following:

Account #: 8428995632 
Billing Date:   07-22-2013
Past Due Date:    08-12-2013
Service From: 06-11-2013
Service To:    07-11-2013
Days of Service: 30
Current Charges:    $30,488.60
Amount Due:   $30,488.60
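
If what you actually want is every table nested inside the first one, another option is to find the outer <table> and then walk its inner tables row by row. This is only a rough sketch: it assumes the bs4 package with the built-in "html.parser", SAMPLE_HTML is a shortened, well-formed stand-in for your markup (your real page has unclosed tags, so results may differ there), and nested_table_rows is a name I made up for illustration.

```python
from bs4 import BeautifulSoup

# Shortened, well-formed stand-in for the markup in the question (assumption).
SAMPLE_HTML = """
<table>
  <tr><td>
    <b><font>Account #: 8428995632 </font></b>
    <table>
      <tr>
        <td><font><b>Billing Date: </b><br>07-22-2013</font></td>
        <td><font><b>Past Due Date: </b><br>08-12-2013</font></td>
      </tr>
    </table>
    <table>
      <tr><td><font><b>Meter</b></font></td><td><font><b>Consumption</b></font></td></tr>
      <tr><td><font>S10406906</font></td><td><font>10</font></td></tr>
    </table>
  </td></tr>
</table>
"""


def nested_table_rows(markup):
    """Return one list of rows (each a list of cell strings) per inner table."""
    soup = BeautifulSoup(markup, "html.parser")
    outer = soup.find("table")  # the first table in the document
    tables = []
    # find_all searches all descendants, so this yields the tables inside "outer".
    for inner in outer.find_all("table"):
        rows = []
        for tr in inner.find_all("tr"):
            # Join the text fragments of a cell with a space and trim whitespace.
            cells = [td.get_text(" ", strip=True) for td in tr.find_all("td")]
            rows.append(cells)
        tables.append(rows)
    return tables


if __name__ == "__main__":
    for number, rows in enumerate(nested_table_rows(SAMPLE_HTML), start=1):
        print("Inner table", number)
        for cells in rows:
            print("  ", cells)
```

Keeping the cells grouped by row and by table makes it easier to pick out, say, only the meter readings afterwards, instead of stopping at a hard-coded heading string.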
