Can anyone help me with Python web scraping? My code is below.

import requests
from bs4 import BeautifulSoup
html = requests.get('https://www.bacb.com/services/o.php?page=101127&by=state&state=CA&pagenum=3').text
soup = BeautifulSoup(html, 'lxml')
type(soup)
print(soup.prettify())
table_rows = table.find_all('tr')
for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)

You will need to run regular expressions over the data afterwards.

Try this:

import requests
from bs4 import BeautifulSoup

# Download the page and parse it with the built-in HTML parser.
html = requests.get('https://www.bacb.com/services/o.php?page=101127&by=state&state=CA&pagenum=3').text
soup = BeautifulSoup(html, 'html.parser')

# Collect every table row in the document, then print the text of its cells.
table_rows = soup.find_all('tr')

for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)
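
The regular-expression step mentioned in this answer is not spelled out, so here is one possible reading of it: a minimal sketch that collapses stray whitespace in each cell and drops cells that end up empty. The helper names `clean_cell` and `clean_row` and the sample row are made up for illustration; the real page may need different patterns.

```python
import re


def clean_cell(text):
    """Collapse runs of whitespace into one space and trim both ends."""
    # In Python 3, \s also matches non-breaking spaces (\xa0) in str patterns.
    return re.sub(r"\s+", " ", text).strip()


def clean_row(cells):
    """Clean every cell and drop the ones that are empty afterwards."""
    cleaned = [clean_cell(c) for c in cells]
    return [c for c in cleaned if c]


# Hypothetical row, shaped like the lists printed by the loop above.
sample_row = ["\n  Jane Doe \n", "\xa0", "Los Angeles,\tCA  "]
print(clean_row(sample_row))
```

In the scraping loop, this would amount to printing `clean_row(row)` instead of `row`.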

Your code is correct, except that you call find_all on "table" instead of "soup" in the table_rows line; "table" is never defined.

import requests
from bs4 import BeautifulSoup
html = requests.get('https://www.bacb.com/services/o.php?page=101127&by=state&state=CA&pagenum=3').text
soup = BeautifulSoup(html, 'lxml')
# print(soup.prettify())
table_rows = soup.find_all('tr')
for tr in table_rows:
  td = tr.find_all('td')
  row = [i.text for i in td]
  print(row)
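
If the intent behind the original `table` variable was to search only inside one table, a variant along these lines may help. It is a sketch under assumptions: `SAMPLE_HTML` is invented markup standing in for the downloaded page, `extract_rows` is a hypothetical helper name, and the real page may hold several tables or a different layout.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>City</th></tr>
  <tr><td> Jane Doe </td><td>Los Angeles</td></tr>
  <tr><td>John Roe</td><td>San Diego</td></tr>
</table>
"""


def extract_rows(html):
    """Return one list of cell strings per row of the first table that has <td> cells."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:  # find() returns None when nothing matches
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # header rows hold only <th> cells, so they come out empty
            rows.append(cells)
    return rows


for row in extract_rows(SAMPLE_HTML):
    print(row)
```

Checking for `None` avoids an AttributeError when the page has no table, and skipping empty lists keeps header rows out of the output.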

