
python beautifulSoup findAll


I am having an issue getting all of the data from this site. The part of the code that fails to produce all of the data is "pn". I was hoping this code would produce these numbers from the site:

58312-GA4 58312-RG4 58312-RR$

I have tried a number of things, from switching the tags and classes to going back and forth between find, findAll, and find_all, and no matter what I try I get only one result. Any help would be great, thanks. Here is the code:

import urllib.request
from bs4 import BeautifulSoup

theurl = "http://www.colehersee.com/home/grid/cat/14/?"
thepage = urllib.request.urlopen(theurl)
soup = BeautifulSoup(thepage, "html.parser")

for pn in soup.find('table',{"class":"mod_products_grid_listing"}).find_all('span',{"class":"product_code"}):
    pn2 = pn.text
for main in soup.find_all('nav',{"id":"breadcrumb"}):
    main1 = main.text

    print(pn2)
    print (main1)

The for loop that gets the 'pn' values runs completely separately from the for loop for the 'main' value. Specifically, by the time your code reaches the second for loop, the first one has already run to completion, and the print calls sit inside the second loop only.

As a result, pn2 holds only the last value the first loop assigned to it, and that single value is what gets printed.
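The overwriting can be seen without any scraping at all. This is a minimal sketch that uses a plain list in place of the spans returned by find_all; the list `codes` is a stand-in built from the numbers quoted in the question, not data fetched from the site.

```python
# Stand-in for the spans found on the page (values taken from the question).
codes = ["58312-GA4", "58312-RG4", "58312-RR$"]

for code in codes:
    pn2 = code  # pn2 is rebound on every pass, discarding the previous value

# The loop has finished, so only the final element is still bound to pn2.
print(pn2)
```

Printing after the loop therefore shows one value, however many elements the loop visited.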

You might want to do something like this:

pn2 = []  # collect every product code instead of overwriting one variable
for pn in soup.find('table', {"class": "mod_products_grid_listing"}).find_all('span', {"class": "product_code"}):
    pn2.append(pn.text)

print(pn2)
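For trying this out without hitting the live site, here is a self-contained sketch of the same idea. The HTML in `SAMPLE_HTML` is hypothetical: only the class names and the product codes come from the question, and the real page's markup may differ. The helper name `extract_codes` is also just an illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the class names and codes come from the question.
SAMPLE_HTML = """
<table class="mod_products_grid_listing">
  <tr><td><span class="product_code">58312-GA4</span></td></tr>
  <tr><td><span class="product_code">58312-RG4</span></td></tr>
  <tr><td><span class="product_code">58312-RR$</span></td></tr>
</table>
"""


def extract_codes(html):
    """Return the text of every product_code span inside the listing table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "mod_products_grid_listing"})
    if table is None:
        # find() returns None when nothing matches; avoid calling find_all on it.
        return []
    spans = table.find_all("span", {"class": "product_code"})
    return [span.get_text(strip=True) for span in spans]


codes = extract_codes(SAMPLE_HTML)
for code in codes:
    print(code)
```

Building the list first and printing afterwards keeps every match, and the None check guards against the table being absent, which would otherwise raise an AttributeError.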


 