How to use a multiprocessing Pool with Python Selenium
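I have some code that needs to fetch data from several hundred web pages, and I would like to speed it up by running multiple Selenium Chrome browser instances. For example, I have this code: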
from selenium import webdriver
from multiprocessing import Pool
from tkinter import *

#initiate browser
def browser():
    global driver
    driver = webdriver.Chrome(r"C:\Users\areed\Desktop\p\chromedriver.exe")
    return driver

#test link
def test():
    links = [link1.com, link2.com, link3.com, link4.com]
    browser()
    for l in links:
        driver.get(l)
        dostuff(driver)

#Scrape Data
def dostuff(driver):
    print('doing Stuff')

#multiprocess Function
def multip():
    pool = Pool(processes=4)
    pool.map(test())

#tkinter Window
if __name__ == "__main__":
    win = Tk()
    win.title("test")
    win.geometry('300x200')
    btn = Button(win, text="Tester", command=multip)
    btn.pack()
    win.mainloop()
How can I get this code to the point where it runs multiple Selenium Chrome browsers? The code works fine without multiprocessing. Could someone explain how to fix this? Thanks!
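I wrote some sample multiprocessing code.
You can pass the link as an argument to the test function (test_func() in the code below).
Each browser then navigates to a different link.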
from selenium import webdriver
from multiprocessing import Pool

# The global driver is removed because a driver cannot be shared between processes.
def browser():
    driver = webdriver.Chrome()
    return driver

def test_func(link):
    driver = browser()  # Each process creates and uses its own driver.
    try:
        driver.get(link)
        # Scrape the page here.
    finally:
        driver.quit()  # Close this process's browser when it is done.

def multip():
    links = ["https://stackoverflow.com/", "https://signup.microsoft.com/"]
    pool = Pool(processes=3)
    for link in links:
        # args must be a tuple of positional arguments, hence the trailing comma.
        pool.apply_async(test_func, args=(link,))
    pool.close()  # No more tasks will be submitted.
    pool.join()   # Wait for all worker processes to finish.

if __name__ == '__main__':
    multip()
I tried the code above and it worked.
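As a side note on the original pool.map(test()) call: it runs test() once in the parent process and passes its return value (None) to map, which also never receives the iterable it requires. Pool.map expects the function itself plus an iterable of arguments. The following is only a minimal sketch of that form, not a definitive implementation. The names scrape and run_in_pool are made up for illustration, returning the page title is just a placeholder for real scraping, and it assumes Selenium can locate a Chrome driver on its own.

```python
import sys
from multiprocessing import Pool


def scrape(link):
    """Open one browser, load one link, and return (link, page title)."""
    # Imported inside the worker so the import happens in the process that uses it.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumption: the Chrome driver can be found automatically
    try:
        driver.get(link)
        return link, driver.title  # placeholder for the real scraping logic
    finally:
        driver.quit()  # always close this process's browser, even on errors


def run_in_pool(worker, items, processes=4):
    """Call worker(item) for every item in a process pool; results keep input order."""
    with Pool(processes=processes) as pool:
        # Pass the function object and the iterable; do not call the function here.
        return pool.map(worker, items)


# Usage: python this_script.py <link> [<link> ...]
if __name__ == "__main__" and len(sys.argv) > 1:
    for link, title in run_in_pool(scrape, sys.argv[1:]):
        print(link, "->", title)
```

Because pool.map blocks until every link has been processed, calling it directly from a Tkinter button callback would freeze the window until the pool finishes.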