How can I increase the speed of my program in Python?

Question

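I am working on a web scraping project in which I have to get links from 19,062 facilities. If I use a for loop, it will take almost 3 hours to complete. I tried writing a generator but could not work out the logic, and I am not sure it can be done with generators at all. So, does any Python expert have an idea for getting what I want faster? In my code, I run it for only 20 ids. Thanks.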


    import requests, json
    from bs4 import BeautifulSoup as bs
    
    
    url = 'https://hilfe.diakonie.de/hilfe-vor-ort/marker-json.php?ersteller=&kategorie=0&text=&n=55.0815&e=15.0418321&s=47.270127&w=5.8662579&zoom=20000'
    res = requests.get(url).json()
    
    url_1 = 'https://hilfe.diakonie.de/hilfe-vor-ort/info-window-html.php?id='
    
    # extracting all the id= from .json res object
    id = []
    
    for item in res['items'][0]["elements"]:
        id.append(item["id"])
    
    
    # opening a .json file and making a dict for links
    file = open('links.json', 'a')
    links = {'links': []}
    
    
    def link_parser(url, id):
        resp = requests.get(url + id).content
        soup = bs(resp, "html.parser")
        link = soup.select_one('p > a').attrs['href']
        links['links'].append(link)
    
    
    # dumping the dict into links.json file
    for item in id[:20]:
        link_parser(url_1, item)
    
    json.dump(links, file)
    file.close()

Answer 1

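In web scraping, speed is not a good idea. If you use a for loop, you will be hitting the server many times a second and will most likely get blocked. A generator will not make this any faster. Ideally, you want to hit the server once and process the data locally.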
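To make the point about request rate concrete, here is a minimal sketch of a loop that pauses between requests instead of firing them back to back. The helper name `fetch_politely`, its parameters, and the one-second default delay are assumptions made for illustration; they are not part of the question's code or of any library.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def fetch_politely(
    ids: Iterable[str],
    fetch: Callable[[str], T],
    delay_seconds: float = 1.0,          # assumed value, tune to what the site tolerates
    sleep: Callable[[float], None] = time.sleep,
) -> List[T]:
    """Call fetch(item_id) for each id, pausing between consecutive calls."""
    results: List[T] = []
    for index, item_id in enumerate(ids):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch(item_id))
    return results
```

With the question's code, `fetch` could be a small function that requests `url_1 + item_id` and returns the parsed `href`, called as `fetch_politely(id[:20], fetch)`. This makes the run slower, not faster, which is the trade-off being argued for here.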

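If it were me, I would look at using a framework like Scrapy, which encourages good practice and provides various Spider classes to support standard techniques.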
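As a rough illustration of the kind of good practice Scrapy encourages, a project's `settings.py` can throttle the crawl for you. The setting names below are standard Scrapy settings, but the values are assumptions picked for this sketch, not tested recommendations for this particular site.

```python
# settings.py of a hypothetical Scrapy project (values are illustrative assumptions)

# Respect the site's robots.txt rules.
ROBOTSTXT_OBEY = True

# Wait roughly this many seconds between requests to the same site.
DOWNLOAD_DELAY = 1.0

# Keep the number of parallel requests per domain small.
CONCURRENT_REQUESTS_PER_DOMAIN = 2

# Let Scrapy adapt the delay to how quickly the server responds.
AUTOTHROTTLE_ENABLED = True
```

A spider in such a project would then request each `info-window-html.php?id=...` page and extract the link in its callback, with the pacing handled by these settings rather than by hand.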

Solution 1
0 2020-12-23 10:05:04