
Beautiful Soup find_all with for loop only returning 1st element

I am trying to find the total number of pages of Airbnb listings by reading the last page number and using it to build a range of pages to loop through.

When I use find_all(loc, {class: id}) and then try to get all the page numbers in that section, I only get the first row (the first page). The image below shows the rows whose text I want, so that I can find the maximum number (10 in this case).
[Image: the rows I want to access]

When I call find_all on the div with that class, it only gives the first page-number row and the a tag with aria-label=Next.

I have tried multiple variations of the code below, but it always returns only the first row of page numbers (2):

import requests
from bs4 import BeautifulSoup

Making editable parameters for scraping

# check-in and check-out dates (a plain string, so it formats cleanly into the URL)
checkin_checkout = 'checkin=2021-05-28&checkout=2021-05-30'
#number of adults for the listing to support
adults = 12
#total beds for the listing
n_beds = adults//2

Getting the URL

# URL I am using, built from adjacent string literals so that no newlines
# or indentation end up inside it
nearby = (
    'https://www.airbnb.com/s/homes?tab_id=home_tab'
    '&refinement_paths%5B%5D=%2Fhomes'
    '&flexible_trip_dates%5B%5D=july&flexible_trip_dates%5B%5D=june'
    '&flexible_trip_lengths%5B%5D=weekend_trip&date_picker_type=calendar'
    '&location_search=NEARBY'
    '&{}'
    '&adults={}'
    '&source=structured_search_input_header&search_type=filter_change'
    '&room_types%5B%5D=Entire%20home%2Fapt'
    '&place_id=ChIJu-A79dZz44kRGu2B8kV8ylQ'
    '&min_beds={}'
).format(checkin_checkout, adults, n_beds)

res = requests.get(nearby)
print(res.status_code)
# parse the response so that `soup` is available below
soup = BeautifulSoup(res.text, 'html.parser')
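
As an alternative to hand-writing the percent-encoded query string, here is a minimal sketch that lets the standard library do the encoding. The helper name build_search_url is hypothetical, and the parameter list is only a subset of the ones in the URL above, chosen for illustration; whether the site accepts this reduced set is an assumption.

```python
from urllib.parse import quote, urlencode


def build_search_url(checkin, checkout, adults, min_beds):
    """Build a search URL from (key, value) pairs (hypothetical helper)."""
    params = [
        ("tab_id", "home_tab"),
        ("refinement_paths[]", "/homes"),
        ("date_picker_type", "calendar"),
        ("location_search", "NEARBY"),
        ("checkin", checkin),
        ("checkout", checkout),
        ("adults", adults),
        ("room_types[]", "Entire home/apt"),
        ("min_beds", min_beds),
    ]
    # quote_via=quote encodes spaces as %20 (instead of +), matching the
    # style of the hand-written URL; urlencode also escapes "[", "]" and "/".
    return "https://www.airbnb.com/s/homes?" + urlencode(params, quote_via=quote)


adults = 12
url = build_search_url("2021-05-28", "2021-05-30", adults, adults // 2)
print(url)
```

Building the query from pairs keeps stray whitespace and line breaks out of the URL and makes each parameter easy to change.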

The part that is not returning what I want

#trying to access the html that holds the page numbers range
# shows up like this as buttons on the bottom of the page (1, 2, 3, 4, 5 ... 10)
div = soup.find_all('div', {'class': '_jro6t0'}) 
for row in div:
    print(row.find_all('a', {'class': '_1y623pm'}))

I tried this code, and it still prints only the first row of page numbers, the one whose class is _1y623pm and whose text is 2.

# The original loop goes through each matching div box one at a time:
# div = soup.find_all('div', {'class': '_jro6t0'})
# for row in div:
#     # This searches inside a single div only, so I think it gives just
#     # one result per div, like in the image.
#     row.find_all('a', {'class': '_1y623pm'})

# First find all div tags with the class name:
divs = soup.find_all('div', {'class': '_jro6t0'})
print(divs)
# Then, inside each found div, find all a tags with the class name.
# find_all returns a ResultSet (a list of tags), which has no find_all
# method of its own, so loop over the divs instead of calling it directly.
for div in divs:
    for row in div.find_all('a', {'class': '_1y623pm'}):
        print(row.text)
        print(row.attrs)
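
To show the nested lookup end to end, here is a small sketch that runs against static markup and picks the largest page number. SAMPLE_HTML and the max_page helper are made up for illustration; only the two class names come from the question, and the live page's structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical pagination markup, not copied from the real page.
SAMPLE_HTML = """
<div class="_jro6t0">
  <a class="_1y623pm" href="?page=2">2</a>
  <a class="_1y623pm" href="?page=3">3</a>
  <a class="_1y623pm" href="?page=10">10</a>
  <a aria-label="Next" href="?page=2">Next</a>
</div>
"""


def max_page(html):
    """Return the largest numeric link text, or 1 if no page links exist."""
    soup = BeautifulSoup(html, "html.parser")
    numbers = []
    for div in soup.find_all("div", {"class": "_jro6t0"}):
        for a in div.find_all("a", {"class": "_1y623pm"}):
            text = a.get_text(strip=True)
            if text.isdigit():  # skips non-numeric link text
                numbers.append(int(text))
    return max(numbers, default=1)


last_page = max_page(SAMPLE_HTML)
print(last_page)
for page in range(1, last_page + 1):
    pass  # request and parse each results page here
```

If the same function finds only one number in the HTML that requests actually fetched, the other page links may simply not be present in that response, which would be worth checking by printing the matched divs.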

@BuddyBob: I added comments to the post. Do you need more, or is that enough?

