
Why does this list return identical values?

I'm trying to scrape http://www.virginiaequestrian.com/main.cfm?action=greenpages&GPType=8 for all of its table values and collect them in a list, one entry per row. For some reason I can't figure out, appending the info dict to the data list only puts in the same value 364 times (the length of the table). I printed each row and value separately in the loop, so I know I am grabbing the right elements and values, but everything seems to break down when I try to put the values in the data list.

Can somebody please tell me what I'm doing wrong?

from bs4 import BeautifulSoup
import requests

r=requests.get('http://www.virginiaequestrian.com/main.cfm?action=greenpages&GPType=8')
soup=BeautifulSoup(r.content,'html5lib')

data = []
info = {}

tbl = soup.findAll('table')[2]
for tr in tbl.findAll('tr')[3:]:
    for td in tr.findAll('td')[0]:
        value= td.string
        info['Name']=value
    for td in tr.findAll('td')[1]:
        value= td.string
        info['City']=value
    for td in tr.findAll('td')[2]:
        value= td.string
        info['Phone']=value
    for td in tr.findAll('td')[3]:
        value = "http://www.virginiaequestrian.com/{}".format(td.a['href'])
        info['ListURL']=value
        data.append(info)
print data

Names in Python (like your info) hold references to objects rather than the objects themselves. Each call to data.append(info) appends another reference to the same dict, so every entry in data is that one dict and shows whatever was written to it last.

You can either (re)create your info dict at each iteration of the outermost for loop:

for tr in tbl.findAll('tr')[3:]:
    info = {}
    ...

or append a copy of your dict to your list:

data.append(info.copy())

which creates a new object each time.
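To make the difference concrete, here is a minimal sketch that reproduces the problem without any scraping. The `rows` list and its values are made up for illustration; only the append pattern mirrors the question.

```python
# Hypothetical stand-in data; the real values would come from the table.
rows = [("A", "Richmond"), ("B", "Roanoke"), ("C", "Norfolk")]

# Buggy pattern: one dict is created once and appended repeatedly.
shared = {}
aliased = []
for name, city in rows:
    shared["Name"] = name
    shared["City"] = city
    aliased.append(shared)  # appends a reference to the same dict

# Every element is the very same object, holding the last row's values.
print(all(item is shared for item in aliased))
print(aliased)

# Fix 1: create a new dict on each iteration.
fresh = []
for name, city in rows:
    info = {}
    info["Name"] = name
    info["City"] = city
    fresh.append(info)

# Fix 2: keep one working dict, but append a (shallow) copy of it.
working = {}
copied = []
for name, city in rows:
    working["Name"] = name
    working["City"] = city
    copied.append(working.copy())

print(fresh)
print(copied)
```

Note that dict.copy() makes a shallow copy, which is enough here because the values are plain strings.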


You can also simplify your inner for loops, since iterating over a single value is not really necessary:

for td in tr.findAll('td')[0]:
    value= td.string
    info['Name']=value
for td in tr.findAll('td')[1]:
    value= td.string
    info['City']=value
for td in tr.findAll('td')[2]:
    value= td.string
    info['Phone']=value
for td in tr.findAll('td')[3]:
    value = "http://www.virginiaequestrian.com/{}".format(td.a['href'])
    info['ListURL']=value

can become

name, city, phone, url = tr.findAll('td')[:4]
info['Name'] = name.string
info['City'] = city.string
info['Phone'] = phone.string
info['ListURL'] = "http://www.virginiaequestrian.com/{}".format(url.a['href'])
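Putting both suggestions together, the loop might look roughly like the sketch below. It parses a small inline HTML snippet instead of fetching the live page, so `SAMPLE_HTML`, its contents, and the `parse_rows` helper are assumptions made for illustration; the real page needs the table index and row offset from the question. It also uses the built-in "html.parser" rather than html5lib to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

BASE_URL = "http://www.virginiaequestrian.com/"

# Hypothetical stand-in for the real page, so the sketch runs offline.
SAMPLE_HTML = """
<table>
  <tr><td>Acme Stables</td><td>Richmond</td><td>555-0100</td><td><a href="main.cfm?id=1">View</a></td></tr>
  <tr><td>Blue Ridge Farm</td><td>Roanoke</td><td>555-0101</td><td><a href="main.cfm?id=2">View</a></td></tr>
</table>
"""


def parse_rows(table):
    """Return one new dict per table row (hypothetical helper)."""
    data = []
    for tr in table.find_all("tr"):
        cells = tr.find_all("td")
        if len(cells) < 4:
            continue  # skip header or spacer rows with too few cells
        name, city, phone, url = cells[:4]
        # A new dict is built on every iteration, so rows never share state.
        info = {
            "Name": name.string,
            "City": city.string,
            "Phone": phone.string,
            "ListURL": "{}{}".format(BASE_URL, url.a["href"]),
        }
        data.append(info)
    return data


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
rows = parse_rows(soup.find("table"))
for row in rows:
    print(row)
```

This sketch assumes every fourth cell contains a link; a row without an `<a>` tag would need an extra check before indexing `url.a["href"]`.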


