Problem parsing a table from HTML with BeautifulSoup and saving it as CSV

Question

import requests
import csv
import requests
from bs4 import BeautifulSoup

r = requests.get('https://pqt.cbp.gov/report/YYZ_1/12-01-2017')
soup = BeautifulSoup(r)
table = soup.find('table', attrs={ "class" : "table-horizontal-line"})
headers = [header.text for header in table.find_all('th')]
rows = []
for row in table.find_all('tr'):
    rows.append([val.text.encode('utf8') for val in row.find_all('td')])

with open('output_file.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerow(headers)
    writer.writerows(row for row in rows if row)

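I am trying to parse all of the table data on this particular web page: https://pqt.cbp.gov/report/YYZ_1/12-01-2017

I get an error on the line soup = BeautifulSoup(r). The error is TypeError: object of type 'Response' has no len(). I am also not sure whether my logic is correct. Please help me scrape the table data.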

Answer 1

I would do it like this, with pandas:

import pandas as pd
result = pd.read_html("https://pqt.cbp.gov/report/YYZ_1/12-01-2017")
df = result[0]
# df = df.drop(labels='Unnamed: 8', axis=1)
df.to_csv(r'C:\Users\User\Desktop\Data.csv', sep=',', encoding='utf-8',index = False )

Answer 2

Try:

r = requests.get('https://pqt.cbp.gov/report/YYZ_1/12-01-2017')
soup = BeautifulSoup(r.content)
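
A small offline sketch of why this helps: BeautifulSoup accepts markup as str or bytes, and on a requests Response, r.content is the raw body as bytes while r.text is the decoded str. The HTML below is a made-up stand-in for the response body, not the real page, and "html.parser" is passed explicitly as an assumption so the result does not depend on which parser happens to be installed.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the body of the HTTP response (not the real page).
body_text = "<table class='results'><tr><td>YYZ</td></tr></table>"  # like r.text (str)
body_bytes = body_text.encode("utf-8")                               # like r.content (bytes)

# Both forms are valid markup for BeautifulSoup; the Response object itself is not.
soup_from_text = BeautifulSoup(body_text, "html.parser")
soup_from_bytes = BeautifulSoup(body_bytes, "html.parser")

cell_from_text = soup_from_text.find("td").text
cell_from_bytes = soup_from_bytes.find("td").text
```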

Answer 3

The variable r is of type Response, not str; use r.text or r.content. Also, there is no table with the class table-horizontal-line. Did you mean results?

soup = BeautifulSoup(r.text)
table = soup.find('table', attrs={"class" : "results"})
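
Putting the fixes together, here is a minimal sketch of the whole parse-and-save flow for Python 3. It is not a verified solution for the live page: parse_table, write_csv and SAMPLE_HTML are hypothetical names introduced for illustration, the sample table is invented, and the class name results is taken from this answer rather than checked against the site. Note that in Python 3 the CSV file is opened in text mode with newline="" and the cells are written as str, so the 'wb' mode and .encode('utf8') from the question are not needed.

```python
import csv

from bs4 import BeautifulSoup


def parse_table(html, css_class="results"):
    """Return (headers, rows) for the first <table> carrying css_class."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"class": css_class})
    if table is None:
        raise ValueError(f"no <table> with class {css_class!r} found")
    headers = [th.get_text(strip=True) for th in table.find_all("th")]
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # rows that contain only <th> cells yield nothing and are skipped
            rows.append(cells)
    return headers, rows


def write_csv(path, headers, rows):
    """Write the header row and the data rows to path as UTF-8 CSV."""
    # Text mode with newline="" is what the csv module expects in Python 3.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        writer.writerows(rows)


# Invented sample markup standing in for r.text (not the real page).
SAMPLE_HTML = """
<table class="results">
  <tr><th>Airport</th><th>Flights</th></tr>
  <tr><td>YYZ</td><td>12</td></tr>
  <tr><td>YVR</td><td>7</td></tr>
</table>
"""

headers, rows = parse_table(SAMPLE_HTML)
```

Against the live page, the same functions would be called as parse_table(r.text) after r = requests.get(...), followed by write_csv('output_file.csv', headers, rows); calling r.raise_for_status() first would surface HTTP errors before parsing.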

3 solutions

Solution 1: score 1, accepted, 2018-12-01 10:56:18

Solution 2: score 0, 2018-12-01 10:31:39

Solution 3: score 0, 2018-12-01 10:51:30
