
Scrapy sends a request using the specified network card (Python 3)

I have created a Scrapy project and it works well. I host it on a server so that it runs daily, and that works too. However, the server has two network cards, one of which was added specifically for Scrapy. The project still runs, but I want Scrapy (or Python) to use only that one network card, and I want to be able to specify which card it uses.

Server: Windows 10
Python: 3.6
Scrapy: 1.5

While looking for a solution I found "Python sends an HTTP request using the specified network card" on the internet, but I did not understand how to use it.

Please help me solve this, for example by assigning the network card to Python, or to the socket or core library that Scrapy uses to request websites.

I dug deeper for a solution and found that Scrapy itself provides the bindaddress key in Request.meta, which specifies the local address that the outgoing connection is bound to.
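To make "binding" concrete, here is a minimal standard-library sketch, independent of Scrapy, of what choosing a source address means at the socket level. The helper names connect_from and loopback_demo are hypothetical, and the loopback address is used only so the demo runs on any machine; this is not how Scrapy is implemented internally, just the same idea in plain sockets.

```python
import socket


def connect_from(local_ip, remote_host, remote_port, timeout=5.0):
    """Open a TCP connection whose source address is local_ip.

    Port 0 in source_address lets the operating system pick any free local port.
    """
    return socket.create_connection(
        (remote_host, remote_port),
        timeout=timeout,
        source_address=(local_ip, 0),
    )


def loopback_demo():
    """Connect to a throwaway local listener and return the client's source IP."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS choose the listener port
        server.listen(1)
        port = server.getsockname()[1]
        client = connect_from("127.0.0.1", "127.0.0.1", port)
        try:
            # The local end of the connection carries the address we bound to.
            return client.getsockname()[0]
        finally:
            client.close()
```

On a machine with two cards, passing the IP of the card reserved for crawling as local_ip would make the connection leave from that address, which is the effect the bindaddress meta key is meant to achieve for Scrapy requests.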

The Scrapy documentation does not seem to show how to use it, so I wrote a downloader middleware that modifies each request and solves my problem. I called it BindAddressMiddleware.

What does the middleware do? It reads two settings:

IS_MORE_NETWORK_CARDS = True: requests are bound to the specified network card; if False, they are not.

BIND_ADDRESS = '127.0.0.1': the IP address of the network card to use, written as a string (replace the example value with that card's own address).
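As a sketch, the two settings could be written in settings.py as follows. The address 192.168.1.50 is an assumed placeholder, not a value from the original post; substitute the IPv4 address assigned to the card reserved for Scrapy.

```python
# settings.py (fragment)

# Turn source-address binding on or off for the whole project.
IS_MORE_NETWORK_CARDS = True

# Assumed example value: use the IP address of the network card
# that Scrapy should send its requests from.
BIND_ADDRESS = '192.168.1.50'
```

The value has to be a quoted string, because an unquoted dotted address is not valid Python.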

Enable the downloader middleware for the Scrapy project in settings.py:

DOWNLOADER_MIDDLEWARES = {
    # Bindaddress
    'scrapers22.middlewares.BindAddressMiddleware': 400,
}

The BindAddressMiddleware downloader middleware:

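from scrapy import signals


class BindAddressMiddleware(object):
    """Send requests from the network card that owns BIND_ADDRESS."""

    def __init__(self, settings):
        self.is_bindaddress = settings.getbool('IS_MORE_NETWORK_CARDS')
        # Always define the attribute so later checks never raise AttributeError.
        self.bindaddress = settings.get('BIND_ADDRESS') if self.is_bindaddress else None

    @classmethod
    def from_crawler(cls, crawler):
        middleware = cls(crawler.settings)
        # Without this connection, spider_opened is never called.
        crawler.signals.connect(middleware.spider_opened, signal=signals.spider_opened)
        return middleware

    def process_request(self, request, spider):
        if self.bindaddress:
            # (ip, 0): port 0 lets the OS choose a free local port.
            request.meta['bindaddress'] = (self.bindaddress, 0)
        return None

    def spider_opened(self, spider):
        if self.bindaddress:
            spider.logger.info('Using: %s as bindaddress' % self.bindaddress)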
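Because a mistyped BIND_ADDRESS would otherwise only show up as failed requests, it may help to check the value before starting a crawl. The following is a rough standard-library sketch; the function name is_local_address is hypothetical and not part of Scrapy. It simply tries to bind a socket to the address, which usually succeeds only for addresses assigned to this machine, although some system configurations allow binding to non-local addresses.

```python
import ipaddress
import socket


def is_local_address(ip):
    """Return True if a TCP socket can be bound to ip on this machine."""
    try:
        parsed = ipaddress.ip_address(ip)
    except ValueError:
        return False  # not a valid IPv4 or IPv6 literal
    family = socket.AF_INET6 if parsed.version == 6 else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.bind((ip, 0))  # port 0: any free local port
        return True
    except OSError:
        # Binding failed, typically because no local interface has this address.
        return False
```

For example, is_local_address('127.0.0.1') should return True on a typical machine, and a string that is not an IP address returns False.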
