
Can't parse the links of different items from a webpage using requests

I've written a Python script that uses BeautifulSoup to scrape the links of different items from a webpage. When I run it, I get only 6 of the 36 links.

Although the rest of the page's content is generated dynamically, I believe there is an elegant way of grabbing it using requests.

Website address

How can I get them all using requests?

I've tried with:

import requests
from bs4 import BeautifulSoup

link = "find the link above"

def get_links(link):
    res = requests.get(link,headers={"User-Agent":"Mozilla/5.0"})
    soup = BeautifulSoup(res.text,"lxml")
    for item_links in soup.select("#pull-results figure[data-pingdom-info='purchasable-deal']"):
        item_link = item_links.select_one("a[class^='cui-content']").get("href")
        yield item_link

if __name__ == '__main__':
    for elem in get_links(link):
        print(elem)

NOTE: I'm not after any solution that relies on a browser simulator such as Selenium.

The data is loaded from a different URL via an AJAX request. It's also necessary to set a correct User-Agent header. This prints all 36 links alongside their titles:

import requests
from bs4 import BeautifulSoup

url = 'https://www.groupon.com/browse/search/partial?division=houston&badge=top-seller&query=med+spa&page=1'

headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0'}

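def get_links(link):
    # The endpoint returns JSON; the rendered deal cards are in its 'cardsHtml' field.
    json_data = requests.get(link, headers=headers).json()
    soup = BeautifulSoup(json_data['cardsHtml'], 'lxml')
    # Pair each deal link with its title; this relies on both selectors matching in the same order.
    for a, title in zip(soup.select('a.cui-content'), soup.select('.cui-udc-title')):
        yield a['href'], title.get_text(strip=True)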

if __name__ == '__main__':
    print('{: <4}{: <40}{}'.format('No.', 'Title', 'URL'))
    print('-' * 120)
    for i, (link, title) in enumerate(get_links(url), 1):
        print('{: <4}{: <40}{}'.format('%s.' % i, title, link))

Prints:

No. Title                                   URL
------------------------------------------------------------------------------------------------------------------------
1.  Body Envy Med Spa                       https://www.groupon.com/deals/body-envy-houston-5
2.  DermaNova Med Spa                       https://www.groupon.com/deals/dermanova-med-spa
3.  Limitless Medspa                        https://www.groupon.com/deals/limitless-med-spa-9
4.  New Heights Med Spa                     https://www.groupon.com/deals/new-heights-med-spa-6
5.  Wild Olive Beauty Haven                 https://www.groupon.com/deals/wild-olive-beauty-haven
6.  Urban Float                             https://www.groupon.com/deals/urban-float-houston-heights-3
7.  Glo Sun Spa Houston                     https://www.groupon.com/deals/glo-sun-spa-7
8.  Massage Heights Weslayan Plaza          https://www.groupon.com/deals/massage-heights-weslayan-plaza-4
9.  Hiatus Spa + Retreat                    https://www.groupon.com/deals/hiatus-spa-retreat-houston
10. Aura Brushed                            https://www.groupon.com/deals/aura-brushed
11. Heights Retreat Salon & Spa             https://www.groupon.com/deals/heights-retreat-new-ein
12. Woosah Massage and Wellness For Women   https://www.groupon.com/deals/woosah-massage-and-wellness
13. RD Laser Skin Solutions                 https://www.groupon.com/deals/rd-laser-skin-solutions-4
14. Clippers                                https://www.groupon.com/deals/clippers-2
15. Paige Larrick Electrology               https://www.groupon.com/deals/paige-larrick-electrology
16. Luxurious Sunless Tanning               https://www.groupon.com/deals/luxurious-sunless-tanning-2-4
17. LeLux Beautique                         https://www.groupon.com/deals/lelux-beautique-7
18. Paul Mitchell the School Houston        https://www.groupon.com/deals/paul-mitchell-the-school-houston
19. Faith Aesthetics                        https://www.groupon.com/deals/faith-aesthetics
20. Malibu Tan                              https://www.groupon.com/deals/malibu-tan-5
21. Maquillage Pro Beauty                   https://www.groupon.com/deals/maquillage-pro-beauty-2-14
22. E-Z Tan                                 https://www.groupon.com/deals/e-z-tan-3
23. Queen's Beauty Salon & Spa              https://www.groupon.com/deals/queens-beauty-salon-and-spa
24. MySmile Inc.                            https://www.groupon.com/deals/mysmile-inc-1
25. Blast Beauty Bar                        https://www.groupon.com/deals/blast-beauty-bar-2
26. No Hair Left Behind                     https://www.groupon.com/deals/no-hair-left-behind-1
27. BACS Clinic - Wellness Centre           https://www.groupon.com/deals/bacs-clinic
28. Soul The Beauty Bar And Yoni Spa        https://www.groupon.com/deals/soul-the-beauty-bar-and-yoni-spa
29. Touch Of Health Massage                 https://www.groupon.com/deals/touch-of-health-massage-1-3
30. Wink At U By Ryan                       https://www.groupon.com/deals/wink-at-u-by-ryan
31. Alanis Salon                            https://www.groupon.com/deals/alanis-salon-2
32. Perfected Lashes                        https://www.groupon.com/deals/perfected-lashes-1
33. Face It Makeup Studio                   https://www.groupon.com/deals/face-it-makeup-studio-3
34. Green Apple Salon                       https://www.groupon.com/deals/green-apple-salon-montrose-2
35. Snatched by J                           https://www.groupon.com/deals/snatched-by-j-body-fit
36. Premier Cosmetic                        https://www.groupon.com/deals/premier-cosmetic-4
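
The URL used above ends in page=1. If the endpoint also serves further pages (an assumption, I have only shown page 1 here), the URLs for them could be built roughly like this. page_url is a hypothetical helper name, not part of requests or of the site's API; it only rewrites the query string and makes no network call:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def page_url(url, page):
    """Return `url` with its `page` query parameter set to `page`."""
    parts = urlsplit(url)
    # Keep every existing parameter except `page`; blank values are preserved.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "page"]
    # Re-append `page` with the requested value.
    query.append(("page", str(page)))
    return urlunsplit(parts._replace(query=urlencode(query)))

base = 'https://www.groupon.com/browse/search/partial?division=houston&badge=top-seller&query=med+spa&page=1'

# Assumption: pages 2 and 3 exist for this search; that is not verified here.
for n in (1, 2, 3):
    print(page_url(base, n))
```

Each URL it produces can be passed to get_links() in place of url. How many pages exist, and what the endpoint returns past the last one, would have to be checked against the live response.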

