I'm working with Scrapy. I have a spider that starts with:
class For_Spider(Spider):
    name = "for"
    # table = 'hello'  # creating dummy attribute. will be overwritten

    def start_requests(self):
        self.table = self.dc  # dc is passed in
I have the following pipeline:
class DynamicSQLlitePipeline(object):
    @classmethod
    def from_crawler(cls, crawler):
        # Here, you get whatever value was passed through the "table" parameter
        table = getattr(crawler.spider, "table")
        return cls(table)

    def __init__(self, table):
        try:
            db_path = "sqlite:///" + settings.SETTINGS_PATH + "\\data.db"
            db = dataset.connect(db_path)
            table_name = table[0:3]  # FIRST 3 LETTERS
            self.my_table = db[table_name]
When I start the spider with:
scrapy crawl for -a dc=input_string -a records=1
I get:
AttributeError: 'For_Spider' object has no attribute 'table'
If I uncomment the 'table' line, the program starts. I'm confused about why the class-level 'table' works but self.table does not. Can someone explain this?
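table works because it is a class attribute of For_Spider: it exists as soon as the class is defined, so every instance can already see it when the pipeline's from_crawler looks it up.

self.table, on the other hand, is an instance attribute that is only created when start_requests actually runs. self refers to the instance itself, so an attribute assigned through self inside a method does not exist until that method has been called (unless you assign it in __init__, which runs when the instance is created). Judging by the AttributeError, the pipeline's from_crawler is called before start_requests, so at that point the spider has no table attribute yet.

If you try to assign self.table directly in the class body, outside any method, you'll get an error, because self is not defined there.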
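To make the difference concrete, here is a minimal sketch in plain Python, without Scrapy. The class names and the read_table helper are made up for illustration; read_table only mimics the getattr lookup the pipeline does, and the call order is an assumption based on the error in the question.

```python
class ClassAttrSpider:
    # Class attribute: exists as soon as the class body has run.
    table = "hello"

    def start_requests(self):
        self.table = "from start_requests"


class MethodAttrSpider:
    def start_requests(self):
        # Instance attribute: created only when this method is called.
        self.table = "from start_requests"


class InitAttrSpider:
    def __init__(self, dc):
        # Instance attribute created when the instance is constructed.
        self.table = dc


def read_table(spider):
    """Hypothetical stand-in for the pipeline's getattr lookup."""
    return getattr(spider, "table", "<missing>")


# Look the attribute up BEFORE start_requests has been called.
before_class = read_table(ClassAttrSpider())
before_method = read_table(MethodAttrSpider())
before_init = read_table(InitAttrSpider("input_string"))

# Look it up AFTER start_requests has been called.
late = MethodAttrSpider()
late.start_requests()
after_method = read_table(late)

print(before_class)   # hello
print(before_method)  # <missing>
print(before_init)    # input_string
print(after_method)   # from start_requests
```

Only the second case fails the lookup, which matches the situation in the question: the attribute is assigned in a method that has not run yet.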
Also, try inspecting __dict__ on both versions of the class to see their attributes and functions.
With table commented out:
    {'__doc__': None, 'start_requests': <function ...>, 'name': 'for', '__module__': 'builtins'}
As you can see, there is no table attribute.
With table uncommented:
    {'__doc__': None, 'start_requests': <function ...>, 'table': 'hello', 'name': 'for', '__module__': 'builtins'}
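If you want to reproduce this check yourself, something like the following should work. It is a sketch with two stand-in classes (WithoutTable and WithTable are illustrative names, not part of the original spider) and uses vars(), which returns the same mapping as __dict__.

```python
class WithoutTable:
    name = "for"

    def start_requests(self):
        self.table = "set later"


class WithTable:
    name = "for"
    table = "hello"  # class attribute, like the uncommented line

    def start_requests(self):
        self.table = "set later"


# The class __dict__ only contains 'table' when it is defined in the class body.
has_table_without = "table" in vars(WithoutTable)
has_table_with = "table" in vars(WithTable)

# The instance __dict__ gains 'table' only after start_requests has run.
spider = WithoutTable()
instance_before = dict(vars(spider))
spider.start_requests()
instance_after = dict(vars(spider))

print(has_table_without)  # False
print(has_table_with)     # True
print(instance_before)    # {}
print(instance_after)     # {'table': 'set later'}
```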
I hope that was clear :>