
Python BeautifulSoup findAll returns only one result

I am having an issue getting all of the data from this site. The part of the code that does not produce all of the data is "pn". I was hoping this code would produce these numbers from the site:

58312-GA4 58312-RG4 58312-RR$

I have tried a number of things, from switching the tags and classes to going back and forth between find, findAll, and find_all, and no matter what I try I get only one result. Any help would be great, thanks. Here is the code:

import urllib.request
from bs4 import BeautifulSoup

theurl="http://www.colehersee.com/home/grid/cat/14/?"
thepage = urllib.request.urlopen(theurl)
soup = BeautifulSoup(thepage,"html.parser")

for pn in soup.find('table',{"class":"mod_products_grid_listing"}).find_all('span',{"class":"product_code"}):
    pn2 = pn.text
for main in soup.find_all('nav',{"id":"breadcrumb"}):
    main1 = main.text

    print(pn2)
    print (main1)

You're running the for loop that sets the 'pn' value completely separately from the for loop that sets the 'main' value. Specifically, by the time your code reaches the second for loop, the first for loop has already run to completion.

As a result, pn2 holds only the last value assigned inside the first for loop, and since the print calls sit inside the second loop, that single value is all you see.
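To see the effect in isolation, here is a minimal sketch with no scraping involved. The list `codes` and the name `last_seen` are made up for illustration; the values are the ones from the question, standing in for the text of the matched spans.

```python
# Hypothetical stand-in for the text of each matched product_code span.
codes = ["58312-GA4", "58312-RG4", "58312-RR$"]

for code in codes:
    last_seen = code  # reassigned on every iteration, so earlier values are lost

# The loop has already finished here, so only the final value remains.
print(last_seen)
```

The same thing happens to pn2 in the question: each pass through the loop replaces the previous value instead of keeping it.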

You might want to collect the values in a list instead, something like:

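pn2 = []  # collect every product code instead of overwriting a single variable
for pn in soup.find('table',{"class":"mod_products_grid_listing"}).find_all('span',{"class":"product_code"}):
    pn2.append(pn.text)
print(pn2)  # print after the loop, once all codes are collected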
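For a version you can run without hitting the site, here is a rough sketch that applies the same idea to an inline HTML string. The HTML fragment is invented to match the selectors used in the question (including the "Switches" breadcrumb text); the real page's markup may well differ.

```python
from bs4 import BeautifulSoup

# Made-up fragment mimicking the structure the question's selectors expect.
html = """
<nav id="breadcrumb">Home &gt; Switches</nav>
<table class="mod_products_grid_listing">
  <tr><td><span class="product_code">58312-GA4</span></td></tr>
  <tr><td><span class="product_code">58312-RG4</span></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Gather every product code into a list rather than a single variable.
table = soup.find("table", {"class": "mod_products_grid_listing"})
codes = [span.get_text(strip=True)
         for span in table.find_all("span", {"class": "product_code"})]

# The breadcrumb appears once, so read it once outside the loop.
breadcrumb = soup.find("nav", {"id": "breadcrumb"}).get_text(strip=True)

# Print the breadcrumb alongside each collected code.
for code in codes:
    print(breadcrumb, code)
```

Note that find returns None when nothing matches, so on a real page it is worth checking the result of soup.find before calling find_all or get_text on it.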

