How to get rid of \ufeff in parsed HTML page
!wget -q -O 'boroughs.html' "https://en.wikipedia.org/wiki/List_of_London_boroughs"

from bs4 import BeautifulSoup

with open('boroughs.html', encoding='utf-8-sig') as fp:
    soup = BeautifulSoup(fp, "lxml")

data = []
table = soup.find("table", {"class": "wikitable sortable"})
table_body = table.find('tbody')
rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [col.text.strip() for col in cols]
    data.append([col for col in cols if col])  # Get rid of empty values
data
Try using utf8 instead:
from lxml import html

with open('boroughs.html', encoding='utf8') as fp:
    doc = html.fromstring(fp.read())

data = []
rows = doc.xpath("//table/tbody/tr")
for row in rows:
    cols = row.xpath("./td/text()")
    cols = [col.strip() for col in cols if col.strip()]
    data.append(cols)
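The XPath extraction above can be sanity-checked on an inline snippet; the miniature table markup below is a hypothetical stand-in for the downloaded Wikipedia page (requires lxml installed):

```python
from lxml import html

# Hypothetical miniature of the borough table
page = "<table><tbody><tr><td> Camden </td><td></td><td>London</td></tr></tbody></table>"
doc = html.fromstring(page)

data = []
for row in doc.xpath("//table/tbody/tr"):
    # text() yields no node for the empty cell; strip() drops surrounding whitespace
    cols = [col.strip() for col in row.xpath("./td/text()") if col.strip()]
    data.append(cols)

print(data)  # [['Camden', 'London']]
```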
import os
from bs4 import BeautifulSoup

os.system('wget -q -O "boroughs.html" "https://en.wikipedia.org/wiki/List_of_London_boroughs"')

with open('boroughs.html', encoding='utf-8-sig') as fp:
    soup = BeautifulSoup(fp, "lxml")

data = []
table = soup.find("table", {"class": "wikitable sortable"})
table_body = table.find('tbody')
rows = table_body.find_all('tr')
for row in rows:
    cols = row.find_all('td')
    cols = [col.text.strip() for col in cols]
    data.append([col.replace(u'\ufeff', '') for col in cols])
print(data)
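If the BOM has already leaked into the scraped strings, a plain str.replace pass removes it; the cell values below are made up for illustration:

```python
# Hypothetical scraped cells, one polluted by a leading BOM
cols = ['\ufeffInner London', 'Camden']
clean = [col.replace('\ufeff', '') for col in cols]
print(clean)  # ['Inner London', 'Camden']
```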
Try the following:
with open('boroughs.html', encoding='utf-8-sig') as fp:
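Opening with encoding='utf-8-sig' works because that codec consumes a leading byte-order mark, while plain 'utf-8' decodes it as the character U+FEFF; a minimal sketch:

```python
# UTF-8 bytes with a byte-order mark, as some tools save them
raw = b'\xef\xbb\xbfCamden'

print(raw.decode('utf-8'))      # '\ufeffCamden' (BOM survives)
print(raw.decode('utf-8-sig'))  # 'Camden' (BOM stripped)
```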