Python, BeautifulSoup: parsing a table

Question

I am parsing some content out of a table.

Here is the content and the code.

txt = '''
<head><META http-equiv="Content-Type" content="text/html; charset=UTF-8">    </head><table><tr><th filter=all>Employee Name</th><th filter=all>Project     Name</th><th filter=all>Area</th><th filter=all>Date</th><th filter=all>Employee     Manager</th></tr>
<tr><td style="vnd.ms-excel.numberformat:@">David</td><td style="vnd.ms-    excel.numberformat:@">Review-2016</td><td style="vnd.ms-    excel.numberformat:@">US</td><td align=right>17/03/2016</td><td style="vnd.ms-    excel.numberformat:@">Andrew</td></tr>
<tr><td style="vnd.ms-excel.numberformat:@">Kate</td><td style="vnd.ms-excel.numberformat:@">Review 2016</td><td style="vnd.ms-excel.numberformat:@">UK</td><td align=right>21/03/2016</td><td style="vnd.ms-excel.numberformat:@">Liz</td></tr>

'''

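from bs4 import BeautifulSoup

soup = BeautifulSoup(txt, "lxml")
soup.prettify()

# Every <tr> in the first table, including the header row of <th> cells
list_5 = soup.find_all('table')[0].find_all("tr")

for row in list_5:
    for nn in row.find_all("td"):
        print(nn.text)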

So far this collects all of the text, but it comes out row by row as one flat list:

David
Review-2016
US
17/03/2016
Andrew
Kate
Review 2016
UK
21/03/2016
Liz

What I need is the data by column, for example David, Kate or US, UK, and so on.

Could you point me in the right direction? Thanks.

Answer 1

If you want to print David, Kate, the code below will work:

for row in list_5[1:]:  # skip the header row, which has <th> cells and no <td>
    print(row.find_all('td')[0].text)
# Changing find_all('td')[0] to find_all('td')[2] prints US and UK instead.
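
To get every column at once rather than one index at a time, the rows can be transposed with zip. This is only a sketch under a few assumptions: the table is a trimmed copy of the one in the question (style attributes dropped, closing </table> added), the names html, headers, data and columns are made up for illustration, and the built-in "html.parser" is used in place of "lxml" so that no extra parser has to be installed.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the table from the question (assumption: attributes removed).
html = """
<table>
<tr><th>Employee Name</th><th>Project Name</th><th>Area</th><th>Date</th><th>Employee Manager</th></tr>
<tr><td>David</td><td>Review-2016</td><td>US</td><td>17/03/2016</td><td>Andrew</td></tr>
<tr><td>Kate</td><td>Review 2016</td><td>UK</td><td>21/03/2016</td><td>Liz</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = soup.find("table").find_all("tr")

# The first row carries the <th> headers; the remaining rows carry <td> data.
headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]
data = [[td.get_text(strip=True) for td in row.find_all("td")] for row in rows[1:]]

# zip(*data) turns the list of rows into tuples of column values.
columns = dict(zip(headers, zip(*data)))

for name, values in columns.items():
    print(name + ": " + ", ".join(values))
```

With this layout, columns["Employee Name"] holds David and Kate, and columns["Area"] holds US and UK. Note that zip stops at the shortest row, so a row with missing cells would silently shorten every column.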

2 Accepted 2017-04-17 03:37:32
