
Scraping a web page with Python

Here is my code. It produces the data I want, but not in the output format I want:


    import requests
    from bs4 import BeautifulSoup
    url = 'https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_Florida'

    fl = requests.get(url)
    fl_soup = BeautifulSoup(fl.text, 'html.parser')
    block = fl_soup.findAll('td', {'class': 'bb-04em'})

    for name in fl_soup.findAll('td', {'class': 'bb-04em'}):
        print(name.text)

Output:

    2020-04-21
    27,869(+3.0%)
    867

I would like the output on one line, like this: 2020-04-21 27,869(+3.0%) 867

The following should do what you want:

import requests
from bs4 import BeautifulSoup

url = 'https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_Florida'

fl = requests.get(url)
fl_soup = BeautifulSoup(fl.text, 'html.parser')

# Locate the chart container, then the table inside it
div_with_table = fl_soup.find('div', {'class': 'barbox tright'})
table = div_with_table.find('table')

# Walk the table row by row so each row's cells print on one line
for row in table.find_all('tr'):
    for cell in row.find_all('td', {'class': 'bb-04em'}):
        print(cell.text, end=' ')
    print()  # new line for each row

Before accessing each <td>, iterate over each <tr> first; that gives you the cells of one table row at a time. You can then search for <td> elements, or anything else you want, within that row.
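The row-then-cell idea can be tried without hitting the network. The sketch below is only an illustration: `SAMPLE_HTML` is a hypothetical stand-in for the Wikipedia markup (the real page may be structured differently), the second data row holds placeholder values, and `rows_as_lines` is a helper name made up for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page; the real Wikipedia markup may differ.
# The second data row contains placeholders, not real figures.
SAMPLE_HTML = """
<div class="barbox tright">
  <table>
    <tr><th>Date</th><th>Cases</th><th>Deaths</th></tr>
    <tr>
      <td class="bb-04em">2020-04-21</td>
      <td class="bb-04em">27,869(+3.0%)</td>
      <td class="bb-04em">867</td>
    </tr>
    <tr>
      <td class="bb-04em">DATE-2</td>
      <td class="bb-04em">CASES-2</td>
      <td class="bb-04em">DEATHS-2</td>
    </tr>
  </table>
</div>
"""


def rows_as_lines(html, cell_class="bb-04em"):
    """Return one space-joined string per table row that has matching cells."""
    soup = BeautifulSoup(html, "html.parser")
    lines = []
    for row in soup.find_all("tr"):
        # Collect the text of every matching cell in this row only
        cells = [cell.get_text(strip=True)
                 for cell in row.find_all("td", class_=cell_class)]
        if cells:  # skip header rows and rows without matching cells
            lines.append(" ".join(cells))
    return lines


lines = rows_as_lines(SAMPLE_HTML)
print("\n".join(lines))
```

Collecting each row into a list and joining it with a space avoids the trailing space and the blank lines that printing cell by cell can leave behind.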

Alternatively, pass the end parameter in the final print call. By default, print uses end='\n', which is why each value lands on its own line:

print(name.text, end=' ')

This would give you the desired output.
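To see what the `end` parameter changes, here is a small sketch that captures what `print` writes. The `values` list simply reuses the three strings from the question, and `render` is a hypothetical helper introduced only for this comparison.

```python
import contextlib
import io

# The three values from the question's output
values = ["2020-04-21", "27,869(+3.0%)", "867"]


def render(items, end):
    """Capture what print() writes for each item with the given end string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        for item in items:
            print(item, end=end)
    return buffer.getvalue()


default_output = render(values, end="\n")  # one value per line
spaced_output = render(values, end=" ")    # one line, with a trailing space
joined = " ".join(values)                  # one line, no trailing space

print(repr(default_output))
print(repr(spaced_output))
print(joined)
```

Note that `end=' '` leaves a trailing space and never emits a newline, so when several rows are printed they all run together on one line unless a bare `print()` follows each row. Building the line with `' '.join(...)` sidesteps both issues.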

