SgmlLinkExtractor and regular expression for matching a word in a string

Question

I'm using Scrapy's SgmlLinkExtractor to extract specific URLs.

I override the start_requests function to crawl dynamically generated URLs.

It looks like this:

def start_requests(self):
  .....
  yield Request(url.strip(), callback=self.callbackA)

callbackA does nothing right now.

I also implemented process_value for the SgmlLinkExtractor, but it is never called.

This is the rule I'm using:

rules = [Rule(SgmlLinkExtractor(allow=()), callback=callbackB, follow=True),]

Again, callbackB is never called.
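
For readers unfamiliar with the hook mentioned above: process_value is a callable that the link extractor applies to each extracted attribute value, and returning None drops that link. The question does not show the actual hook, so the following is only a minimal sketch under an assumed goal of keeping URLs that contain the whole word "item". The name keep_item_links and the pattern are hypothetical, not taken from the question.

```python
import re

# Hypothetical pattern: the word "item" delimited by word boundaries,
# so "/item/42" matches but "/items/42" does not.
WORD = re.compile(r"\bitem\b")


def keep_item_links(value):
    """Return the value unchanged if it contains the word, else None."""
    return value if WORD.search(value) else None
```

A function like this would be passed to the extractor as SgmlLinkExtractor(process_value=keep_item_links). It only runs when the rule itself is applied to a response.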

Answer 1

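If your callbacks are declared in your spider, they do not have global scope, so the rule has to refer to the spider's own method. Note that rules is a class attribute, and self is not defined at the point where it is evaluated. Pass the method name as a string instead, and Scrapy will look it up on the spider instance:

rules = [
  Rule(SgmlLinkExtractor(), callback='callbackB', follow=True),
]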
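
One caveat worth checking when wiring up rule callbacks: the rules list is built while the class body runs, before any spider instance exists, so the name self cannot be resolved there. Scrapy's Rule accepts the callback as a string naming a spider method for this reason. The sketch below is plain Python rather than Scrapy code. FakeRule and FakeSpider are hypothetical stand-ins that only mimic the name lookup, as an illustration of the idea and not the library's implementation.

```python
class FakeRule:
    """Hypothetical stand-in for a crawl rule; stores the callback as given."""

    def __init__(self, callback):
        self.callback = callback


class FakeSpider:
    # The class body runs before any instance exists, so a string name is
    # used here instead of an attribute lookup on an instance.
    rules = [FakeRule(callback="callbackB")]

    def __init__(self):
        # Resolve each callback to something callable on this instance.
        self.resolved = [self._resolve(rule.callback) for rule in self.rules]

    def _resolve(self, callback):
        # Callables pass through; strings are looked up as method names.
        if callable(callback):
            return callback
        return getattr(self, callback, None)

    def callbackB(self, response):
        return "callbackB saw " + response


def class_body_self_fails():
    """Return True if referring to `self` in a class body raises NameError."""
    try:
        class Broken:
            rules = [FakeRule(callback=self.callbackB)]  # noqa: F821
    except NameError:
        return True
    return False
```

Calling FakeSpider().resolved[0]("page") invokes the bound method, while class_body_self_fails() shows that the self-based form fails as soon as the class is defined.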

1 answer

solution1
0 2012-07-24 14:32:37