
Problem while reading URLs from a CSV file and getting the output in a CSV file as well

The code below is supposed to return the status code and model number of a few products from the website https://www.selexion.be/ . It worked fine when I put all the URLs in the urls array inside the code, but when I fetch the URLs from a CSV file I get this error.

I also want to store the output URL, status code and model number in an array, and flush ( .flush() and os.fsync() ) that array to a CSV file once every link's status code and model number has been fetched. Right now I get the output in the terminal, but I want it in a CSV file as well. A rough sketch of what I have in mind for the output part is below — the results list, the flush_results name and the output path are placeholders I made up:
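
import csv
import os

results = []  # each entry: (url, status_code, model_number)

def flush_results(path='C:\\Users\\Zandrio\\Documents\\Advanced Project\\output.csv'):
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['url', 'status', 'model'])
        writer.writerows(results)
        f.flush()             # push Python's internal buffer to the OS
        os.fsync(f.fileno())  # ask the OS to commit the file to disk

Each worker would then append (url, response.status, soup) to results instead of only printing, and flush_results() would run once after asyncio.gather finishes.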

Error:

PS C:\Users\Zandrio> & C:/Users/Zandrio/AppData/Local/Programs/Python/Python38/python.exe "c:/Users/Zandrio/Documents/Advanced Project/Selexion.py"
Traceback (most recent call last):
  File "c:/Users/Zandrio/Documents/Advanced Project/Selexion.py", line 49, in <module>
    asyncio.run(main())
  File "C:\Users\Zandrio\AppData\Local\Programs\Python\Python38\lib\asyncio\runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "C:\Users\Zandrio\AppData\Local\Programs\Python\Python38\lib\asyncio\base_events.py", line 612, in run_until_complete
    return future.result()
  File "c:/Users/Zandrio/Documents/Advanced Project/Selexion.py", line 41, in main
    await asyncio.gather(*(worker(f'w{index}', url, session)
  File "c:/Users/Zandrio/Documents/Advanced Project/Selexion.py", line 32, in worker
    response = await session.get(url, headers=header)
  File "C:\Users\Zandrio\AppData\Local\Programs\Python\Python38\lib\site-packages\aiohttp\client.py", line 380, in _request
    url = URL(str_or_url)
  File "C:\Users\Zandrio\AppData\Local\Programs\Python\Python38\lib\site-packages\yarl\__init__.py", line 149, in __new__
    raise TypeError("Constructor parameter should be str")
TypeError: Constructor parameter should be str

Code:

import asyncio
import csv
import aiohttp
import time
from bs4 import BeautifulSoup

urls = []

try:
    with open('C:\\Users\\Zandrio\\Documents\\Advanced Project\\input_links.csv', 'r', newline='') as csvIO:
        urls = list(csv.reader(csvIO))
except FileNotFoundError:
    pass


header = {
    'Host': 'www.selexion.be',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'Cache-Control': 'max-age=0',
    'TE': 'Trailers'
}


async def worker(name, url, session):
    response = await session.get(url, headers=header)
    html = await response.read()
    soup = BeautifulSoup(html, features='lxml').select_one('.title-options span:first-of-type').text
    print(f'URL: {url} - {response.status} - {soup}')


async def main():
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*(worker(f'w{index}', url, session)
                               for index, url in enumerate(urls)))


if __name__ == '__main__':
    start = time.perf_counter()
    asyncio.run(main())
    elapsed = time.perf_counter() - start
    print(f'Executed in {elapsed:0.2f} seconds')

The error message says it is a TypeError, meaning a function expected a parameter of type A, but a value of type B was passed.

TypeError: Constructor parameter should be str
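
This particular TypeError is raised by yarl, the URL library aiohttp uses internally (you can see it in the last frames of the traceback). You can reproduce it directly — a quick check, using the site from the question as the example URL:

from yarl import URL

URL('https://www.selexion.be/')    # fine: the argument is a str
URL(['https://www.selexion.be/'])  # TypeError: Constructor parameter should be str

So session.get received something that is not a string.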

In the line

await asyncio.gather(*(worker(f'w{index}', url, session)

what is the type of url? You can find out with

type(url)

or by running your debugger.
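
For example, a quick print inside the loop that builds the tasks (assuming urls was filled by csv.reader as in the question) would show list, not str:

for index, url in enumerate(urls):
    print(index, type(url))  # e.g. 0 <class 'list'> -- csv.reader yields each row as a list of strings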

What happens when you flip those two lines?

 await asyncio.gather(*(worker(f'w{index}', url, session)
                            for index, url in enumerate(urls)))

I don't see where url comes from otherwise.
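
If type(url) turns out to be list (csv.reader yields each row as a list of column strings, so list(csv.reader(...)) is a list of lists), one fix is to take the first column of each row when loading the file. A minimal sketch, assuming every row of input_links.csv holds a single URL in its first column:

import csv

urls = []
try:
    with open('C:\\Users\\Zandrio\\Documents\\Advanced Project\\input_links.csv', 'r', newline='') as csvIO:
        # keep only the first column so urls is a flat list of str
        urls = [row[0] for row in csv.reader(csvIO) if row]
except FileNotFoundError:
    pass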
