
Threading still takes a very long time

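I wrote a script that builds checkout URLs for Shopify websites. It does this by appending each unique product "variant" ID to the checkout URL and then opening that URL in a web browser. To find the variant IDs, I need to parse the website's sitemap. I currently do this in a separate thread for each product I want to parse, but each additional thread increases the total time considerably (by almost one second).

Why is this the case? Since each thread does essentially the same thing, shouldn't it take roughly the same time?

For reference, one thread takes about 2.0s, two threads about 2.8s, and three threads about 3.8s.

Here is my code: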

import time
import requests
from bs4 import BeautifulSoup
import webbrowser
import threading

sitemap2 = 'https://deadstock.ca/sitemap_products_1.xml'
atc_url = 'https://deadstock.ca/cart/'

# CHANGE SITEMAP TO THE CORRECT ONE (THE SITE YOU ARE SCRAPING)

variant_list = []


def add_to_cart(keywords, size):
    init = time.time()
    # Initialize session
    product_url = ''
    parse_session = requests.Session()
    response = parse_session.get(sitemap2)
    soup = BeautifulSoup(response.content, 'lxml')
    variant_id = 0

    # Find Item
    for urls in soup.find_all('url'):
        for images in urls.find_all('image:image'):
            if all(i in images.find('image:title').text.lower() for i in keywords):
                now = time.time()
                product_name = images.find('image:title').text
                print('FOUND: ' + product_name + ' - ' + str(format(now-init, '.3g')) + 's')
                product_url = urls.find("loc").text

    if product_url != '':
        response1 = parse_session.get(product_url+".xml")
        soup = BeautifulSoup(response1.content,'lxml')
        for variants in soup.find_all('variant'):
            if size in variants.find('title').text.lower():
                variant_id = variants.find('id', type='integer').text
                atc_link = str(variant_id)+':1'
                print(atc_link)
                variant_list.append(atc_link)


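    # product_name is only assigned when a matching product was found,
    # so check product_url instead of relying on UnboundLocalError.
    if product_url != '':
        print("PARSED PRODUCT: " + product_name)
    else:
        print("Retrying")
        add_to_cart(keywords, size)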


def open_checkout():
    url = 'https://deadstock.ca/cart/'
    for var in variant_list:
        url = url + var + ','
    webbrowser.open_new_tab(url)




# When initializing a new thread, only change the keywords in the args, and make sure you start and join the thread.
# Change the sitemap in scraper.py to your website's sitemap.

# If the script finds multiple items, the first item will be opened so please try to be very specific yet accurate.

def main():
    print("Starting Script")
    init = time.time()

    try:
        t1 = threading.Thread(target=add_to_cart, args=(['alltimers','relations','t-shirt','white'],'s',))
        t2 = threading.Thread(target=add_to_cart, args=(['alltimers', 'relations', 'maroon'],'s',))
        t3 = threading.Thread(target=add_to_cart, args=(['brain', 'dead','melter'], 's',))
        t1.start()
        t2.start()
        t3.start()
        t1.join()
        t2.join()
        t3.join()
        print(variant_list)
        open_checkout()
    except Exception:  # a bare except would also swallow KeyboardInterrupt and retry forever
        print("Product not found / not yet live. Retrying..")
        main()

    print("Time taken: " + str(time.time()-init))

if __name__ == '__main__':
    main()

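Question: ... one thread takes about 2.0s, two threads about 2.8s, and three threads about 3.8s

In your example code, you are measuring the total time of all the threads together.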
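
To make the difference between per-thread time and total time concrete, here is a minimal sketch. The `fake_task` and `run_threads` functions are hypothetical and not part of the original script, and `time.sleep` is only an assumed stand-in for the real sitemap request. Each thread records its own elapsed time, while the caller measures the wall-clock time around all of them, which is what the timer in `main()` reports.

```python
import threading
import time


def fake_task(name, delay, results):
    """Hypothetical stand-in for add_to_cart: sleeps instead of doing network I/O."""
    start = time.perf_counter()
    time.sleep(delay)  # simulated wait for a response (assumption, not real scraping)
    results[name] = time.perf_counter() - start  # time spent inside this thread only


def run_threads(delays):
    """Start one thread per delay and return (per-thread times, total wall-clock time)."""
    results = {}
    start = time.perf_counter()
    threads = [
        threading.Thread(target=fake_task, args=(f"t{i}", delay, results))
        for i, delay in enumerate(delays, start=1)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    total = time.perf_counter() - start  # covers thread startup, all tasks, and joins
    return results, total


if __name__ == "__main__":
    per_thread, total = run_threads([0.2, 0.2, 0.2])
    for name, elapsed in sorted(per_thread.items()):
        print(f"{name}: {elapsed:.2f}s")
    print(f"total wall-clock time: {total:.2f}s")
```

Timing inside each thread like this shows whether individual lookups are getting slower or whether only the overall figure is growing.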
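As @asettouf pointed out, threading has an overhead, which means you have to pay a price for it.
But I would expect that running these 3 tasks in threads is still faster than running them one after another.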
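
One way to check that expectation is to time the same work both ways. The sketch below is an illustration under stated assumptions: `simulated_request`, `time_sequential`, and `time_threaded` are hypothetical names, and the task is a plain `time.sleep` rather than the real sitemap download and parse.

```python
import threading
import time


def simulated_request(delay):
    """Hypothetical stand-in for one product lookup; sleeps to mimic waiting on the network."""
    time.sleep(delay)


def time_sequential(n_tasks, delay):
    """Run the tasks one after another and return the elapsed wall-clock time."""
    start = time.perf_counter()
    for _ in range(n_tasks):
        simulated_request(delay)
    return time.perf_counter() - start


def time_threaded(n_tasks, delay):
    """Run the tasks in separate threads and return the elapsed wall-clock time."""
    start = time.perf_counter()
    threads = [
        threading.Thread(target=simulated_request, args=(delay,))
        for _ in range(n_tasks)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {time_sequential(3, 0.2):.2f}s")
    print(f"threaded:   {time_threaded(3, 0.2):.2f}s")
```

Sleeping threads overlap almost perfectly, so this is a best case. The real script also parses XML in every thread, which is CPU work that CPython threads cannot run in parallel, so the actual gain may be smaller than this sketch suggests.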
