
asyncio aiohttp errors when threading

I have a script that was written for me, and I can't get it to run; I receive the following errors:

Traceback (most recent call last):
  File "crawler.py", line 56, in <module>
    loop.run_until_complete(future)
  File "C:\Users\lisa\AppData\Local\Programs\Python\Python37-32\lib\asyncio\base_events.py", line 568, in run_until_complete
    return future.result()
  File "crawler.py", line 51, in run
    await responses
  File "crawler.py", line 32, in bound_fetch
    await fetch(url, session)
  File "crawler.py", line 22, in fetch
    async with session.get(url, headers=headers) as response:
  File "C:\Users\lisa\AppData\Local\Programs\Python\Python37-32\lib\site-packages\aiohttp\client.py", line 843, in __aenter__
    self._resp = await self._coro
  File "C:\Users\lisa\AppData\Local\Programs\Python\Python37-32\lib\site-packages\aiohttp\client.py", line 387, in _request
    await resp.start(conn)
  File "C:\Users\lisa\AppData\Local\Programs\Python\Python37-32\lib\site-packages\aiohttp\client_reqrep.py", line 748, in start
    message, payload = await self._protocol.read()
  File "C:\Users\lisa\AppData\Local\Programs\Python\Python37-32\lib\site-packages\aiohttp\streams.py", line 533, in read
    await self._waiter
aiohttp.client_exceptions.ServerDisconnectedError: None

Is there something obvious I'm missing? I can run the same script without threading, thanks...

import asyncio
import sys
from itertools import product
from string import ascii_lowercase, digits

from aiohttp import ClientSession


headers = {'User-Agent': 'Mozilla/5.0'}

letter = sys.argv[1]       # fixed first character of every code
number = int(sys.argv[2])  # how many URLs to request

# Every code is <letter> + two lowercase letters + three digits.
first_group = product(ascii_lowercase, repeat=2)
second_group = product(digits, repeat=3)
codeList = [''.join([''.join(k) for k in prod]) for prod in product([letter], first_group, second_group)]

async def fetch(url, session):
    # Fetch a single URL and return its body.
    async with session.get(url, headers=headers) as response:
        statusCode = response.status
        if statusCode == 200:
            print("{} statusCode is {}".format(url, statusCode))
        return await response.read()


async def bound_fetch(sem, url, session):
    # Acquire the semaphore before fetching to cap concurrent requests.
    async with sem:
        await fetch(url, session)

def getUrl(codeIndex):
    return "https://www.blahblah.com/" + codeList[codeIndex] + ".png"

async def run(r):
    tasks = []
    sem = asyncio.Semaphore(1000)  # up to 1000 requests may be in flight at once

    async with ClientSession() as session:
        for i in range(r):
            task = asyncio.ensure_future(bound_fetch(sem, getUrl(i), session))
            tasks.append(task)

        responses = asyncio.gather(*tasks)
        await responses

loop = asyncio.get_event_loop()

future = asyncio.ensure_future(run(number))
loop.run_until_complete(future)

I can't leave comments, so I'll write here. Try setting a lower semaphore limit; I think the problem depends on how many requests you are making to the site at the same time. You also need to catch errors like that around each fetch if you want to end up with responses from all the requests; see the sketch below.
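Here is a minimal sketch of both suggestions, rewritten from the question's fetch/bound_fetch/run. The semaphore value of 100 and the choice to return None for failed URLs are assumptions, not values from the original post:

import asyncio

import aiohttp
from aiohttp import ClientSession

headers = {'User-Agent': 'Mozilla/5.0'}

async def fetch(url, session):
    # ServerDisconnectedError is a subclass of aiohttp.ClientError, so this
    # except clause keeps one failed request from aborting the whole gather().
    try:
        async with session.get(url, headers=headers) as response:
            if response.status == 200:
                print("{} statusCode is {}".format(url, response.status))
            return await response.read()
    except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
        print("{} failed: {!r}".format(url, exc))
        return None  # assumption: skipping a failed URL is acceptable

async def bound_fetch(sem, url, session):
    async with sem:
        return await fetch(url, session)

async def run(urls):
    sem = asyncio.Semaphore(100)  # assumed limit; tune to what the server tolerates
    async with ClientSession() as session:
        tasks = [asyncio.ensure_future(bound_fetch(sem, url, session)) for url in urls]
        return await asyncio.gather(*tasks)

If some URLs still fail, they come back as None in the gathered results, so you can collect them and retry in a second pass with an even lower limit.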
