Scraping a web page using Python

Question

Here is the code. It produces the data I want, but not in the output format I want.

    import requests
    from bs4 import BeautifulSoup
    url = 'https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_Florida'

    fl = requests.get(url)
    fl_soup = BeautifulSoup(fl.text, 'html.parser')
    block = fl_soup.findAll('td', {'class': 'bb-04em'})

    for name in fl_soup.findAll('td', {'class': 'bb-04em'}):
        print(name.text)

Output:

    2020-04-21
    27,869(+3.0%)
    867

I would like the output to look like this:

    2020-04-21 27,869(+3.0%) 867

Answer 1

The following should do what you want:

    import requests
    from bs4 import BeautifulSoup
    url = 'https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_Florida'

    fl = requests.get(url)
    fl_soup = BeautifulSoup(fl.text, 'html.parser')

    # Locate the container div, then the table inside it
    div_with_table = fl_soup.find('div', {'class': 'barbox tright'})
    table = div_with_table.find('table')

    # Walk the table row by row so the cells of one row share a line
    for row in table.findAll('tr'):
        for cell in row.findAll('td', {'class': 'bb-04em'}):
            print(cell.text, end=' ')
        print()  # new line for each row

Answer 2

Before accessing each <td>, try getting the data by <tr> first. That gives you the contents of each table row, and you can then search inside it for <td> or whatever else you need.
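
As a rough sketch of that row-first approach, the snippet below parses a small inline HTML string instead of fetching the live page. SAMPLE_HTML and rows_as_lines are hypothetical names made up for this illustration, the header row is invented, and the real Wikipedia markup is larger and may differ. Only the bb-04em class and the three cell values come from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the table on the Wikipedia page.
# The header row is invented; the data row uses the values from the question.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Cases</th><th>Deaths</th></tr>
  <tr>
    <td class="bb-04em">2020-04-21</td>
    <td class="bb-04em">27,869(+3.0%)</td>
    <td class="bb-04em">867</td>
  </tr>
</table>
"""


def rows_as_lines(html):
    """Return one space-joined string per table row that has matching cells."""
    soup = BeautifulSoup(html, "html.parser")
    lines = []
    for row in soup.find_all("tr"):
        cells = row.find_all("td", class_="bb-04em")
        if cells:  # skip header rows and rows without the target cells
            lines.append(" ".join(cell.get_text(strip=True) for cell in cells))
    return lines


for line in rows_as_lines(SAMPLE_HTML):
    print(line)  # prints: 2020-04-21 27,869(+3.0%) 867
```

Collecting each row into a list before joining keeps the cells of one row together and avoids a trailing space at the end of the line.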

Answer 3

In the last print call, pass the end parameter. By default, print uses end='\n'.

    print(name.text, end=' ')

This would give you the desired output.
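
To make the effect of end visible without fetching the page, here is a small sketch that captures what print writes. The values list and the render helper are assumptions made up for this illustration; the three strings are copied from the question's output and stand in for name.text.

```python
import io
from contextlib import redirect_stdout

# Values copied from the question's output; they stand in for name.text.
values = ["2020-04-21", "27,869(+3.0%)", "867"]


def render(end):
    """Print every value with the given end string and return what was written."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        for value in values:
            print(value, end=end)
    return buffer.getvalue()


print(repr(render("\n")))  # default behaviour: one value per line
print(repr(render(" ")))   # values separated by spaces, no newline emitted
```

Note that with end=' ' no newline is written at all, so if the loop covers cells from several table rows, all of them would likely end up on one long line unless a bare print() is called at the end of each row.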
