
Scrapy can't crawl all links in a page

Original post: 2016-02-09 23:52:30 · Tags: python, shell, xpath, scrapy

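I want Scrapy to crawl the AJAX site http://play.google.com/store/apps/category/GAME/collection/topselling_new_free

I want to get all the links that point to the individual games.

I inspected the page's elements. It looks like this (screenshot: how the page looks), so I want to extract every link matching the pattern /store/apps/details?id=

But when I run the command in the Scrapy shell, it returns nothing (screenshot: shell command).

I also tried //a/@href. That didn't work either, and I don't know what is going wrong.

Right now I can crawl the first 120 links by modifying the start URL and adding "formdata", as someone suggested, but I get no more links after that.

Can someone help me with this?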

1 Answer

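The data on that page is actually populated by an AJAX POST request, so the links are not in the HTML that the Scrapy shell fetches for the page URL. Instead of inspecting the elements, open the Network tab in your browser's developer tools; you will find the request there.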

Make a POST request to the URL https://play.google.com/store/apps/category/GAME/collection/topselling_new_free?authuser=0 with formdata={'start':'0','num':'60','numChildren':'0','ipf':'1','xhr':'1'}
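To make the shape of that request concrete, here is a minimal standard-library sketch that builds the form-encoded POST without sending it. The helper name `build_post` is hypothetical, and the field values are simply the ones quoted above; in a Scrapy spider the same dict would be passed as the `formdata` argument of `scrapy.FormRequest`.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL and form fields are taken from the answer above.
URL = ("https://play.google.com/store/apps/category/GAME/"
       "collection/topselling_new_free?authuser=0")


def build_post(start=0, num=60):
    """Build (but do not send) the form-encoded POST request."""
    form = {
        "start": str(start),
        "num": str(num),
        "numChildren": "0",
        "ipf": "1",
        "xhr": "1",
    }
    body = urlencode(form).encode("ascii")
    # urllib treats a Request that carries data= as a POST.
    return Request(URL, data=body)


req = build_post()
print(req.get_method())   # POST
print(req.data.decode())  # start=0&num=60&numChildren=0&ipf=1&xhr=1
```

Printing the body this way makes it easy to compare against what the browser's Network tab shows for the same request.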

Increase start by 60 on each subsequent request to get the paginated results.

