Multiprocessing with Python, Execution never completes

I'm new to multiprocessing, please help.

All libraries are imported and the get_links method works; I've tested it on a single URL. I'm trying to run the method for multiple URLs distributed across parallel processes to make it faster. Without multiprocessing, my runtimes are over 10 hours.

Edit 2:

I tried my best at an MCVE:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
from multiprocessing import Pool

options = Options()
options.headless = True
options.binary_location = 'C:\\Users\\Liam\\AppData\\Local\\Google\\Chrome SxS\\Application\\Chrome.exe'
options.add_argument('--blink-settings=imagesEnabled=false')
options.add_argument('--no-sandbox')
options.add_argument("--proxy-server='direct://'")
options.add_argument("--proxy-bypass-list=*")

subsubarea_urls = []
with open('subsubarea_urls.txt') as f:
    for item in f:
        item = item.strip()
        subsubarea_urls.append(item)

test_urls = subsubarea_urls[:3] 

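def get_links(url):
    # Each worker process starts its own headless Chrome instance.
    driver = webdriver.Chrome(r'....\Chromedriver', chrome_options=options)
    try:
        driver.get(url)
        soup = BeautifulSoup(driver.page_source, 'html.parser')
        link = soup.find(class_='listings__all')
        if link is not None:
            link = "example.com" + link.find('a')['href']
    finally:
        # quit() ends the browser session and the chromedriver process;
        # close() only closes the current window.
        driver.quit()
    return link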

def main():
    how_many = 3
    p = Pool(processes=how_many)
    data = p.map(get_links, test_urls)
    # close() stops new tasks from being submitted; join() waits for the workers to exit.
    p.close()
    p.join()

    with open('test_urls.txt', 'w') as f:
        f.write(str(data))

if __name__ == '__main__':
    main()

Unexpectedly, the problem had nothing to do with the code. Multiprocessing in Python does not seem to get along with Windows GUIs: the subprocesses started by Pool don't have standard streams. The code needs to be run in IDLE opened with the command python -m idlelib.idle.

See Terry Jan Reedy's answer here
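
To check whether the worker processes have standard streams in a given environment, a minimal sketch along these lines may help. The names describe_streams and check_worker_streams are hypothetical and not part of the original code. The sketch relies on the documented behaviour that sys.stdout and sys.stderr can be None for Windows GUI apps that are not connected to a console, for example scripts started with pythonw.

```python
import sys
from multiprocessing import Pool


def describe_streams(_):
    # Runs inside a worker process and reports whether its standard
    # streams exist. Without a console they can be None.
    return {
        "stdout": sys.stdout is not None,
        "stderr": sys.stderr is not None,
    }


def check_worker_streams(processes=2):
    # Hypothetical helper: ask each task to describe the streams of the
    # worker that ran it, and return the results in task order.
    with Pool(processes=processes) as pool:
        return pool.map(describe_streams, range(processes))


if __name__ == "__main__":
    # The main guard is needed on Windows, where worker processes
    # re-import this module when they start.
    print(check_worker_streams())
```

If any entry reports False, output written by the workers has nowhere to go in that environment, and starting the script from a console is worth trying.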
