Since Scrapy spiders are run with Scrapy's own terminal commands, how can I run functions that I define myself?
Example below:
import scrapy

class Fcc(scrapy.Spider):
    name = "fcc"
    start_urls = ["http://freecodecamp.org/"]

    def parse(self, response):
        for link in response.css("a::attr(href)").getall():
            yield {
                "url": link,
            }

    def add(self):
        with open("links.txt", "a") as f:
            f.write(next(self.parse()))
So now, if I run the spider from the terminal with the command below, it will only execute the parse function. How can I run the add function when I want to?
scrapy runspider fcc_spider.py
This would help me work with the data I crawl from any website.
P.S. This is just an example. Please don't give solutions specific to this code; give solutions that can be used in any situation.
By default, Scrapy executes the start_requests or parse methods. You can use def __init__ to check for command-line parameters and run your target function.
You can run your user-defined functions by calling them in one of your Scrapy callbacks.
You could call it before or after the for loop inside the parse method (keep in mind the asynchronous nature of Scrapy).
You could also define a constructor for your Spider and pass the contents of the links.txt file to it.
Here is an example from the Scrapy documentation: https://docs.scrapy.org/en/latest/topics/spiders.html#spider-arguments
In Python, it's possible to create inner functions (a function defined inside another function).
A function defined inside another function is known as an inner function or a nested function. In Python, this kind of function can access names in the enclosing function. Here's an example of how to create an inner function in Python:
def outer_func():
    def inner_func():
        print("Hello, World!")
    inner_func()

outer_func()
Output:
Hello, World!
In this code, you define inner_func() inside outer_func() to print the Hello, World! message to the screen. To do that, you call inner_func() on the last line of outer_func(). This is the quickest way to write an inner function in Python. However, inner functions provide a lot of interesting possibilities beyond what you see in this example.
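One of those possibilities is that the inner function can read names from the enclosing function. The sketch below shows this with made-up names (make_link_filter, is_internal, domain), chosen only to fit the link-crawling theme of the question.

```python
def make_link_filter(domain):
    def is_internal(link):
        # `domain` is not a parameter of is_internal; it is read from
        # the enclosing function's scope.
        return link.startswith("/") or domain in link

    return is_internal


is_internal = make_link_filter("freecodecamp.org")
print(is_internal("/learn"))                # True
print(is_internal("https://example.com/"))  # False
```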
INTEGRATION Example
Based on that, you can define a function inside one of your Scrapy callbacks and call it within that callback.
def parse_disease(self, response):
    def function_name(name):
        to_return = "hello {}".format(name)
        return to_return

    # Some code here...
    pharam = function_name(name)