
How to get a URL from Beautiful Soup?

Original post 2016-05-02 17:37:01 3 2 javascript/ python/ html/ beautifulsoup/ html-parsing

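I'm new to Python and am trying to write a crawler. I want to use Beautiful Soup to scrape some data from BBC News.

But when I inspected the elements with Firebug, I found that the links in this page's HTML do not contain full URLs.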

<li class="">
<a class="navigation-wide-list__link navigation-arrow--open" data-panel-id="js-navigation-panel-World" href="/news/world">
    <span>World</span>
</a>

The href is '/news/world', which is not the real URL. If I want to scrape all the links on this page, what should I do? Is this because the site uses JavaScript?

2 Answers

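Given the base (current) URL and the relative value from the href attribute, you need to build the absolute URL. The recommended way is to use urljoin(), which lives in urllib.parse on Python 3 and in urlparse on Python 2: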

from urllib.parse import urljoin  # on Python 2: from urlparse import urljoin

absolute_url = urljoin(url, href)
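
To collect every link on the page, the same call can be applied to each anchor that Beautiful Soup finds. This is only a minimal sketch, assuming the page HTML has already been downloaded. The sample markup, the `base_url` value, and the helper name `absolute_links` are illustrative and not part of the original answer.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Sample markup modelled on the snippet in the question (illustrative only).
html = """
<ul>
  <li><a href="/news/world"><span>World</span></a></li>
  <li><a href="/news/uk"><span>UK</span></a></li>
  <li><a href="https://www.bbc.co.uk/sport">Sport</a></li>
  <li><a>No href here</a></li>
</ul>
"""

# Assumed URL of the page the markup came from.
base_url = "https://www.bbc.com/news"


def absolute_links(html, base_url):
    """Return every href in the document, resolved against base_url."""
    soup = BeautifulSoup(html, "html.parser")
    # href=True skips <a> tags that have no href attribute at all.
    return [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]


links = absolute_links(html, base_url)
for link in links:
    print(link)
```

Links that are already absolute pass through urljoin() unchanged, so relative and absolute hrefs can be handled by the same code path.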

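To answer your last sub-question:

It is not surprising that the href value is /news/world. This is a relative reference, which is defined in the URI syntax RFC (RFC 3986). JavaScript is not needed to handle relative references. Browsers have supported them since the earliest days of the web, and they link to a document relative to the current document or relative to the host.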
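
The different kinds of relative reference can be seen by resolving a few of them against one base URL. A small sketch using only the standard library; the base URL and the sample references are hypothetical values chosen for illustration.

```python
from urllib.parse import urljoin

# Hypothetical URL of the document that contains the links.
base = "https://www.bbc.com/news/world/europe"

# Relative to the host: a leading slash replaces the whole path.
host_relative = urljoin(base, "/news/world")

# Relative to the current document: replaces only the last path segment.
doc_relative = urljoin(base, "asia")

# ".." climbs one directory before appending.
parent_relative = urljoin(base, "../uk")

# Scheme-relative: keeps the scheme, replaces host and path.
scheme_relative = urljoin(base, "//www.bbc.co.uk/sport")

print(host_relative)    # https://www.bbc.com/news/world
print(doc_relative)     # https://www.bbc.com/news/world/asia
print(parent_relative)  # https://www.bbc.com/news/uk
print(scheme_relative)  # https://www.bbc.co.uk/sport
```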

