
Scrapy Only Returning First Result in Loop

I have a loop (shown below) that executes twice (x is 1, then 2), but Scrapy returns only the first trackname in both results. The print item line shows different values on each pass, so I know the loop works, but the returned items don't reflect the changing value of x.

Any idea what mistake I have made?

items = []
item = scrapyItem()

for x in range (1,3):
    str_selector = '//tr[@name="tracks-grid-browse_track_{0}"]/td[contains(@class,"secondColumn")]/a/text()'.format(x)
    item['trackname'] = hxs.select(str_selector).extract()
    print item
    items.append(item)
return items

You should build a new item on each iteration instead of reusing the same one. As written, you append the same object to items every time. That object is mutable (as instances of user-defined classes are by default in Python), so when you update item['trackname'], every entry in items changes, because they are all the same object.

Here is some code to illustrate:

>>> class C(object):
        # Basic user-defined class
    def __init__(self):
        self.test = None


>>> c = C()
>>> items = []
>>> for x in range (1,3):
    c.test = x
    print c, c.test
    items.append(c)


<__main__.C object at 0x01CEB130> 1
<__main__.C object at 0x01CEB130> 2
>>> items # All objects contained are the same !!!
[<__main__.C object at 0x01CEB130>, <__main__.C object at 0x01CEB130>]
>>> for c in items:
    print c.test


2
2
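
The same aliasing can be checked directly with the is operator. The sketch below is written for Python 3 and uses a plain dict as a stand-in for the Scrapy item. That stand-in is an assumption made so the snippet runs without Scrapy, and the track names are invented.

```python
# A plain dict stands in for the Scrapy item (assumption: the item behaves
# like a mutable mapping). The track names are invented for illustration.
tracknames = ["first track", "second track"]

shared = {}      # created once, outside the loop
aliased = []
for name in tracknames:
    shared["trackname"] = name   # overwrites the same object on every pass
    aliased.append(shared)       # appends another reference, not a copy

# Both list entries refer to one object, so both show the last value written.
print(aliased[0] is aliased[1])
print([d["trackname"] for d in aliased])
```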

Now create a new object each time:

>>> items = []
>>> for x in range (1,3):
    c = C()
    c.test = x
    print c, c.test
    items.append(c)


<__main__.C object at 0x01CEB110> 1
<__main__.C object at 0x011F2270> 2

The objects are now different:

>>> for c in items:
    print c.test


1
2

What you are doing right now is creating a single item object and changing its value inside the loop. You need to create the item inside the loop:

items = []
#item = scrapyItem()

for x in range (1,3):
    item = scrapyItem()
    str_selector = '//tr[@name="tracks-grid-browse_track_{0}"]/td[contains(@class,"secondColumn")]/a/text()'.format(x)
    item['trackname'] = hxs.select(str_selector).extract()
    print item
    items.append(item)
return items
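
For experimenting outside a spider, the corrected loop can be reduced to a small function. This is only a sketch in Python 3: build_items and extract_trackname are hypothetical names, a plain dict replaces scrapyItem, and the page lookup is faked with a dictionary rather than a real XPath query.

```python
def build_items(extract_trackname, indexes):
    """Return one freshly created item per index (hypothetical helper)."""
    items = []
    for x in indexes:
        item = {}                                  # new object on every iteration
        item["trackname"] = extract_trackname(x)
        items.append(item)
    return items


# Invented data standing in for hxs.select(...).extract() results.
fake_page = {1: ["Track A"], 2: ["Track B"]}

items = build_items(fake_page.get, range(1, 3))
print(items)
print(items[0] is items[1])   # distinct objects, so each keeps its own value
```

Because each item is created inside the loop, later assignments cannot overwrite entries that were already appended.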
