
Running Playwright on Google Colab gives error: asyncio.run() cannot be called from a running event loop

I am trying to run Playwright web automation on Google Colab, but I cannot get an event loop to run there.

Here is what I tried:

!pip install playwright
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.firefox.launch(headless=True)
    page = browser.new_page()
    page.goto("https://www.google.com")

    page.wait_for_timeout(3000)
    browser.close()

This gives me the following error:

ERROR:root:An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line string', (1, 33))

---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
<ipython-input-29-bc0f59648c4a> in <module>()
      1 from playwright.sync_api import sync_playwright
      2 
----> 3 with sync_playwright() as p:
      4     browser = p.firefox.launch(headless=True)
      5     page = browser.new_page()

/usr/local/lib/python3.7/dist-packages/playwright/sync_api/_context_manager.py in __enter__(self)
     44             raise Error(
     45                 """It looks like you are using Playwright Sync API inside the asyncio loop.
---> 46 Please use the Async API instead."""
     47             )
     48 

Error: It looks like you are using Playwright Sync API inside the asyncio loop.
Please use the Async API instead.

So I tried the Async API instead:

import time
import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        page = await browser.new_page(storage_state='auth.json')
        await page.goto('https://www.instagram.com/explore/tags/alanzoka/')
        time.sleep(6)
        html = await page.content()

        time.sleep(5)

        # await browser.close()


asyncio.run(main())

But this gives me a different error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-34-c582898e6ee9> in <module>()
     27 
     28 
---> 29 asyncio.run(main())

/usr/lib/python3.7/asyncio/runners.py in run(main, debug)
     32     if events._get_running_loop() is not None:
     33         raise RuntimeError(
---> 34             "asyncio.run() cannot be called from a running event loop")
     35 
     36     if not coroutines.iscoroutine(main):

RuntimeError: asyncio.run() cannot be called from a running event loop

I need a working way to set up and use the playwright package on Google Colab.

I'm not sure about Colab, but in a regular Jupyter notebook you can do this:

import nest_asyncio
nest_asyncio.apply()

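Install it with pip install nest-asyncio. After calling nest_asyncio.apply(), asyncio.run() can be called from inside the notebook's already-running event loop, so you can run async code in the notebook.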
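For context, here is a minimal sketch of why the error appears and one standard-library workaround. It assumes the notebook kernel already runs an event loop in the main thread, which is what the traceback in the question indicates. The names run_in_fresh_loop and demo are hypothetical and only for illustration. This is not what nest_asyncio does internally; it just moves the coroutine to a worker thread that has no running loop.

```python
import asyncio
import threading


def run_in_fresh_loop(coro):
    """Run `coro` to completion even if this thread already has a running loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe here.
        return asyncio.run(coro)

    # A loop is already running (the notebook case). asyncio.run() would raise
    # RuntimeError here, so run the coroutine in a worker thread instead;
    # asyncio.run() creates a new event loop for that thread.
    outcome = {}

    def worker():
        try:
            outcome["value"] = asyncio.run(coro)
        except BaseException as exc:  # hand any failure back to the caller
            outcome["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()  # blocks the calling cell until the coroutine finishes

    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


async def demo():
    # Stand-in for a coroutine such as main() from the question.
    await asyncio.sleep(0.01)
    return "done"
```

In a notebook cell, run_in_fresh_loop(main()) would take the place of asyncio.run(main()). IPython-based notebooks also generally accept a top-level await main() in a cell, which avoids the helper entirely.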

Edit: you are also trying to run a GUI instance of Chrome with headless=False. Change it to headless=True; Colab has no GUI to run it in.
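If the same notebook is meant to run both locally and on Colab, one option is to pick the headless flag from the environment instead of hard-coding it. The sketch below assumes that on Linux a missing DISPLAY (or WAYLAND_DISPLAY) variable means there is no GUI, which is a heuristic and not something Playwright itself checks. should_run_headless is a hypothetical helper name.

```python
import os
import sys


def should_run_headless(env=None, platform=None):
    """Guess whether a browser must be launched headless.

    Heuristic (an assumption, not a Playwright feature): on Linux, no
    DISPLAY or WAYLAND_DISPLAY variable means no GUI is available.
    Other platforms are assumed to have a desktop session.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    if platform.startswith("linux"):
        return not (env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    return False
```

The result could then be passed to the launch call, for example p.chromium.launch(headless=should_run_headless()).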

I just found this answer in another SO comment, and I can confirm that it works: https://stackoverflow.com/a/74518471/15898955

!apt install chromium-chromedriver

!pip install nest_asyncio
!pip install playwright

Once all of the dependencies above are installed, you can run Playwright scripts in Colab.

import nest_asyncio
nest_asyncio.apply()

import asyncio
from playwright.async_api import async_playwright

async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch_persistent_context(
            executable_path="/usr/bin/chromium-browser",
            user_data_dir="/content/random-user"
        )
        page = await browser.new_page()
        await page.goto("https://google.com")
        title = await page.title()
        print(f"Title: {title}")
        await browser.close()

asyncio.run(main())
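The script above hard-codes executable_path, which assumes the apt package placed the browser at /usr/bin/chromium-browser. A small sketch for checking that before launching, using only the standard library; find_browser_executable and its candidate list are hypothetical and may need adjusting for a given Colab image.

```python
import os
import shutil


def find_browser_executable(
    candidates=("/usr/bin/chromium-browser", "chromium-browser", "chromium"),
):
    """Return the first candidate that resolves to an executable, else None.

    Absolute paths are checked directly; bare names are looked up on PATH.
    """
    for candidate in candidates:
        if os.path.isabs(candidate):
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
        else:
            found = shutil.which(candidate)
            if found:
                return found
    return None
```

If the function returns None, the install step most likely did not complete, and launching with that executable_path would be expected to fail.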


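Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.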

 