
Can't parse the links of different items from a webpage using requests

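I've written a script in Python that uses BeautifulSoup to scrape the links of different items from a webpage. When I run the script, I get only 6 of the 36 links.

Although the rest of the page's content is generated dynamically, I believe there must be an elegant way to grab those links using requests.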

Website address

How can I get all of them using requests?

I've tried with:

import requests
from bs4 import BeautifulSoup

link = "find the link above"

def get_links(link):
    res = requests.get(link,headers={"User-Agent":"Mozilla/5.0"})
    soup = BeautifulSoup(res.text,"lxml")
    for item_links in soup.select("#pull-results figure[data-pingdom-info='purchasable-deal']"):
        item_link = item_links.select_one("a[class^='cui-content']").get("href")
        yield item_link

if __name__ == '__main__':
    for elem in get_links(link):
        print(elem)

Note: I'm not looking for a solution that relies on a browser simulator such as Selenium.

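The data is loaded from a different URL through an AJAX request. Setting a correct User-Agent is also necessary. This prints all 36 links along with their titles: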

import requests
from bs4 import BeautifulSoup

url = 'https://www.groupon.com/browse/search/partial?division=houston&badge=top-seller&query=med+spa&page=1'

headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0'}

def get_links(link):
    json_data = requests.get(link, headers=headers).json()
    soup = BeautifulSoup( json_data['cardsHtml'], 'lxml' )
    for a, title in zip(soup.select('a.cui-content'), soup.select('.cui-udc-title')):
        yield a['href'], title.get_text(strip=True)

if __name__ == '__main__':
    print('{: <4}{: <40}{}'.format('No.', 'Title', 'URL'))
    print('-' * 120)
    for i, (link, title) in enumerate(get_links(url), 1):
        print('{: <4}{: <40}{}'.format('%s.' % i, title, link))

Prints:

No. Title                                   URL
------------------------------------------------------------------------------------------------------------------------
1.  Body Envy Med Spa                       https://www.groupon.com/deals/body-envy-houston-5
2.  DermaNova Med Spa                       https://www.groupon.com/deals/dermanova-med-spa
3.  Limitless Medspa                        https://www.groupon.com/deals/limitless-med-spa-9
4.  New Heights Med Spa                     https://www.groupon.com/deals/new-heights-med-spa-6
5.  Wild Olive Beauty Haven                 https://www.groupon.com/deals/wild-olive-beauty-haven
6.  Urban Float                             https://www.groupon.com/deals/urban-float-houston-heights-3
7.  Glo Sun Spa Houston                     https://www.groupon.com/deals/glo-sun-spa-7
8.  Massage Heights Weslayan Plaza          https://www.groupon.com/deals/massage-heights-weslayan-plaza-4
9.  Hiatus Spa + Retreat                    https://www.groupon.com/deals/hiatus-spa-retreat-houston
10. Aura Brushed                            https://www.groupon.com/deals/aura-brushed
11. Heights Retreat Salon & Spa             https://www.groupon.com/deals/heights-retreat-new-ein
12. Woosah Massage and Wellness For Women   https://www.groupon.com/deals/woosah-massage-and-wellness
13. RD Laser Skin Solutions                 https://www.groupon.com/deals/rd-laser-skin-solutions-4
14. Clippers                                https://www.groupon.com/deals/clippers-2
15. Paige Larrick Electrology               https://www.groupon.com/deals/paige-larrick-electrology
16. Luxurious Sunless Tanning               https://www.groupon.com/deals/luxurious-sunless-tanning-2-4
17. LeLux Beautique                         https://www.groupon.com/deals/lelux-beautique-7
18. Paul Mitchell the School Houston        https://www.groupon.com/deals/paul-mitchell-the-school-houston
19. Faith Aesthetics                        https://www.groupon.com/deals/faith-aesthetics
20. Malibu Tan                              https://www.groupon.com/deals/malibu-tan-5
21. Maquillage Pro Beauty                   https://www.groupon.com/deals/maquillage-pro-beauty-2-14
22. E-Z Tan                                 https://www.groupon.com/deals/e-z-tan-3
23. Queen's Beauty Salon & Spa              https://www.groupon.com/deals/queens-beauty-salon-and-spa
24. MySmile Inc.                            https://www.groupon.com/deals/mysmile-inc-1
25. Blast Beauty Bar                        https://www.groupon.com/deals/blast-beauty-bar-2
26. No Hair Left Behind                     https://www.groupon.com/deals/no-hair-left-behind-1
27. BACS Clinic - Wellness Centre           https://www.groupon.com/deals/bacs-clinic
28. Soul The Beauty Bar And Yoni Spa        https://www.groupon.com/deals/soul-the-beauty-bar-and-yoni-spa
29. Touch Of Health Massage                 https://www.groupon.com/deals/touch-of-health-massage-1-3
30. Wink At U By Ryan                       https://www.groupon.com/deals/wink-at-u-by-ryan
31. Alanis Salon                            https://www.groupon.com/deals/alanis-salon-2
32. Perfected Lashes                        https://www.groupon.com/deals/perfected-lashes-1
33. Face It Makeup Studio                   https://www.groupon.com/deals/face-it-makeup-studio-3
34. Green Apple Salon                       https://www.groupon.com/deals/green-apple-salon-montrose-2
35. Snatched by J                           https://www.groupon.com/deals/snatched-by-j-body-fit
36. Premier Cosmetic                        https://www.groupon.com/deals/premier-cosmetic-4
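
The answer hard-codes the AJAX URL for one query and one page. As a minimal sketch, the same URL can be assembled from its query parameters, and the `cardsHtml` key can be checked before parsing. The helper names `build_url` and `extract_cards_html` are hypothetical, and the idea that incrementing `page` returns further result pages is an assumption that has not been verified against the site.

```python
import json
from urllib.parse import urlencode

# Endpoint taken from the answer above.
BASE_URL = "https://www.groupon.com/browse/search/partial"


def build_url(page, division="houston", badge="top-seller", query="med spa"):
    """Return the partial-search URL for one results page (hypothetical helper)."""
    params = {"division": division, "badge": badge, "query": query, "page": page}
    # urlencode escapes the space in the query as '+', as in the answer's URL.
    return "{}?{}".format(BASE_URL, urlencode(params))


def extract_cards_html(payload_text):
    """Return the HTML fragment stored under 'cardsHtml' in the JSON body.

    Raises KeyError with a readable message if the key is absent, which
    would suggest that the endpoint's response format has changed.
    """
    data = json.loads(payload_text)
    if "cardsHtml" not in data:
        raise KeyError("response has no 'cardsHtml' key")
    return data["cardsHtml"]


if __name__ == "__main__":
    # Assumption: pages 2 and 3 exist for this query.
    for page in range(1, 4):
        print(build_url(page))
```

Each URL produced this way could be passed to `get_links` from the answer, using the same `headers`. `extract_cards_html` takes the raw response text (`res.text`), so it can stand in for the direct `json_data['cardsHtml']` lookup when a clearer failure message is wanted.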

