
How to use a multiprocessing Pool with Python Selenium

I have some code that needs to grab data from hundreds of web pages, and I would like to speed it up by running multiple instances of the Selenium Chrome browser. For example, I have this code:

from selenium import webdriver
from multiprocessing import Pool
from tkinter import *

#initiate browser
def browser():
    global driver
    driver = webdriver.Chrome(r"C:\Users\areed\Desktop\p\chromedriver.exe")
    return driver

#test link
def test():
    links = [link1.com, link2.com, link3.com, link4.com]
    browser()
    for l in links:
        driver.get(l)
        dostuff(driver)

#Scrape Data
def dostuff(driver):
    print('doing Stuff')

#multiprocess Function      
def multip():
    pool = Pool(processes=4)
    pool.map(test())

#tkinter Window
if __name__ == "__main__":  
    win = Tk()
    win.title("test")
    win.geometry('300x200')
    btn = Button(win, text="Tester", command=multip)
    btn.pack()
    win.mainloop()

How can I make this code run multiple Selenium Chrome browsers? It works fine without the multiprocessing part. Can someone please explain how to fix this? Thanks!

Here is some sample multiprocessing code.

Pass the link as an argument to the worker function (test_func in the code below).

Each browser will then navigate to a different link.

from selenium import webdriver
from multiprocessing import Pool

# No global driver: a WebDriver instance cannot be shared between processes.
def browser():
    driver = webdriver.Chrome()
    return driver

def test_func(link):
    driver = browser()  # Each process creates its own driver.
    try:
        driver.get(link)
    finally:
        driver.quit()  # Close this browser once the link has been handled.

def multip():
    links = ["https://stackoverflow.com/", "https://signup.microsoft.com/"]
    pool = Pool(processes=3)
    for link in links:
        # args must be a tuple, so note the trailing comma.
        pool.apply_async(test_func, args=(link,))

    pool.close()
    pool.join()
    
if __name__ == '__main__':
    multip()

I have tested the code above and it works.
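
With hundreds of links, starting a new Chrome for every single link may get expensive. One possible variation, sketched here, splits the links into one batch per worker so that each process opens a single browser and reuses it for its whole batch. The names split_links, visit_batch and scrape_all are hypothetical and not part of the code above. Collecting driver.title is only a stand-in for the real scraping step. The Selenium import sits inside the worker purely so the helper functions can be loaded without Selenium installed.

```python
from multiprocessing import Pool


def split_links(links, n_batches):
    """Split links round-robin into at most n_batches non-empty batches."""
    batches = [links[i::n_batches] for i in range(n_batches)]
    return [batch for batch in batches if batch]


def visit_batch(batch):
    """Open one browser in this process and reuse it for every link in the batch."""
    # Imported here (illustrative choice) so the module loads without Selenium.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumes a chromedriver can be located
    titles = []
    try:
        for link in batch:
            driver.get(link)
            titles.append(driver.title)  # placeholder for the real scraping
    finally:
        driver.quit()  # always close the browser, even after an error
    return titles


def scrape_all(links, processes=4):
    """Run visit_batch on each batch in its own worker process."""
    batches = split_links(links, processes)
    if not batches:
        return []  # nothing to do, so do not start a pool
    with Pool(processes=len(batches)) as pool:
        # map() takes the function and an iterable of arguments, and blocks
        # until every batch is finished.
        results = pool.map(visit_batch, batches)
    # Titles come back grouped by batch, not in the original link order.
    return [title for batch_titles in results for title in batch_titles]
```

As in the code above, scrape_all(links) should be called from under the if __name__ == '__main__': guard, which multiprocessing needs on Windows.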

