
Multiprocessing with Python, Execution never completes

New to multiprocessing! Please help.

All libraries are imported and the get_links method works; I've tested it on a single case. I'm trying to run the method over multiple URLs distributed across parallel processes to make it faster. Without multiprocessing, my runtimes are 10+ hours.

Edit 2:

Tried my best at an MCVE:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
from multiprocessing import Pool

options = Options()
options.headless = True
options.binary_location = 'C:\\Users\\Liam\\AppData\\Local\\Google\\Chrome SxS\\Application\\Chrome.exe'
options.add_argument('--blink-settings=imagesEnabled=false')
options.add_argument('--no-sandbox')
options.add_argument("--proxy-server='direct://'")
options.add_argument("--proxy-bypass-list=*")

subsubarea_urls = []
with open('subsubarea_urls.txt') as f:
    for item in f:
        item = item.strip()
        subsubarea_urls.append(item)

test_urls = subsubarea_urls[:3] 

def get_links(url):

    driver = webdriver.Chrome('....\Chromedriver', chrome_options=options)
    driver.get(url)
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    link = soup.find(class_ = 'listings__all')
    if link is not None:
        link = "example.com" + link.find('a')['href']
    driver.close()
    return link

def main():

    how_many = 3
    p = Pool(processes = how_many)
    data = p.map(get_links, test_urls)
    p.close()

    with open('test_urls.txt', 'w') as f:
        f.write(str(data))

if __name__ == '__main__':
    main()

Unexpectedly, the problem had nothing to do with the code. Multiprocessing in Python does not seem to get along with Windows GUIs: the subprocesses started by Pool don't have std streams. The code needs to be executed in IDLE opened with python -m idlelib.idle.

See Terry Jan Reedy's answer here.
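
To check whether missing std streams might be involved in a given environment, a small diagnostic along these lines could help. It is only a sketch: stream_status is a hypothetical helper name, not part of the original code, and it relies on the documented behavior that sys.stdin, sys.stdout and sys.stderr can be None in processes that are not connected to a console.

```python
import os
import sys
from multiprocessing import Pool


def stream_status(_=None):
    """Report which standard streams exist in the current process.

    Hypothetical helper for illustration; the argument is ignored and
    only exists so the function can be used with Pool.map.
    """
    return {
        "pid": os.getpid(),
        "stdin": sys.stdin is not None,
        "stdout": sys.stdout is not None,
        "stderr": sys.stderr is not None,
    }


def main():
    # Status of the parent process first, then of each pool worker.
    # The workers return their status to the parent, so the result is
    # visible even if a worker itself has nowhere to print.
    print("parent:", stream_status())
    with Pool(processes=2) as pool:
        for status in pool.map(stream_status, range(2)):
            print("worker:", status)


if __name__ == "__main__":
    main()
```

If any worker reports False for a stream, that would be consistent with the explanation above. Running the same file from a console and from the GUI makes the difference easy to compare.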
