Modify an HTML file (find and replace href URLs and save)

Question

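EDIT1:

I found an error in my original code that was causing the TypeError. The answer is therefore here: BeautifulSoup - modifying all links in a piece of HTML?. The code now works correctly.

I have an HTML file in which I want to replace some href URLs with others and then save the result as an HTML file again. My goal is that when I open the HTML file and click a link, it takes me to an internal folder instead of the Internet URL (the original one).

That is, I want to convert this: <a href="http://www.somelink.com"> into this: <a href="C:/myFolder/myFile.html">.

I tried opening the file with bs4 and using the replace function, but I got TypeError: 'NoneType' object is not callable

This is my code now: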


from bs4 import BeautifulSoup

# Dict that maps each original link to the local link that replaces it
links_dict = {original_link1: my_link1, original_link2: my_link2}  # and so on..

# Get the original links (the keys of the dict)
original_links = links_dict.keys()

# The file encoding belongs on open(), not on BeautifulSoup()
with open(html_file, encoding="utf8") as file:
    soup = BeautifulSoup(file, "html.parser")

# This part is where I was stuck: loop through every <a> tag that has an href
# and, if that href is a key in links_dict, replace it with the mapped value
for link in soup.find_all('a', href=True):
    if link['href'] in links_dict:
        link['href'] = links_dict[link['href']]

with open("new_file.html", "w", encoding="utf8") as file:
    file.write(str(soup))

Any ideas?

Answer 1

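Once you have parsed the document into a soup, find the "a" elements and check their "href" attributes. If an href matches one of the keys in your dict, replace it as needed.

I would turn "original_link1" and the others into regular expressions so that matching is easier.

As it happens, I believe your question has already been answered; see BeautifulSoup - modifying all links in a piece of HTML?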
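The regular-expression idea could be sketched roughly as follows. This is only an illustration under assumptions: the patterns, the local paths, and the names RULES and rewrite_href are hypothetical and do not come from the question.

```python
import re

# Hypothetical rules: each compiled pattern maps to a local replacement path.
# The URLs and paths below are placeholders, not taken from the question.
RULES = [
    (re.compile(r"^https?://(www\.)?somelink\.com/?$"), "C:/myFolder/myFile.html"),
    (re.compile(r"^https?://(www\.)?otherlink\.com/docs/.*$"), "C:/myFolder/docs.html"),
]


def rewrite_href(href, rules=RULES):
    """Return the replacement for the first pattern that matches href,
    or href unchanged when no pattern matches."""
    for pattern, replacement in rules:
        if pattern.match(href):
            return replacement
    return href
```

Inside the loop over soup.find_all('a', href=True), you could then assign link['href'] = rewrite_href(link['href']). Compared with exact dict keys, this tolerates small variations such as http versus https or a trailing slash.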

Answer 1: accepted, 2019-05-20 11:49:31
