
Web Scraping with Google Compute Engine / App Engine

Original post: 2015-02-23 18:48:17 | Tags: python, google-app-engine, cron, web-scraping, google-compute-engine

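I have written a Python script that uses Selenium to scrape information from a website and store it in a CSV file. It works well on my local machine when I run it manually, but I now want to run it automatically once an hour for several weeks and save the data in a database. Each run takes about 5-10 minutes.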

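I have just started with Google Cloud, and there appear to be several ways to implement this with either Compute Engine or App Engine. So far, I have gotten stuck at some point with each of the three approaches I found (for example, getting a scheduled task to call a URL on my backend instance and having that instance start the script). I have tried to:

Execute the script on Compute Engine and use Datastore or Cloud SQL. It is unclear whether crontab can be set up easily.
Use Task Queues and Scheduled Tasks on App Engine.
Use a backend instance and Scheduled Tasks on App Engine.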

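Given that this is purely a backend script that needs no user-facing front end, I would like to hear which approach others recommend as the easiest and most appropriate.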

1 Answer

App Engine is feasible, but only if you limit your use of Selenium to a .remote connection out to a site such as http://crossbrowsertesting.com/. That works, but it is messy.

I would use Compute Engine: cron is trivial to use on any Linux image. See, for example, http://www.thegeekstuff.com/2009/06/15-practical-crontab-examples/
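As a rough illustration of the cron approach, here is a minimal sketch of an hourly job that cron could start on a Compute Engine VM. Everything named here is an assumption, not part of the original post: the file paths, the table layout, the `scrape()` stand-in for the existing Selenium code, and the use of a local SQLite file in place of Datastore or Cloud SQL. The crontab line in the docstring uses the standard five-field schedule, where `0 * * * *` means minute 0 of every hour.

```python
#!/usr/bin/env python3
"""Hourly scrape job meant to be started by cron on a Linux VM.

Hypothetical crontab entry (all paths are placeholders):
    0 * * * * /usr/bin/python3 /home/scraper/scrape_job.py >> /home/scraper/scrape.log 2>&1
"""
import sqlite3
from datetime import datetime, timezone

DB_PATH = "scrape_results.db"  # placeholder; SQLite stands in for a real database


def scrape():
    """Stand-in for the existing Selenium scraping code.

    Returns a list of (name, value) tuples; the real script would
    build these from the pages it visits.
    """
    return [("example-item", "42")]


def save_rows(db_path, rows, scraped_at=None):
    """Append rows to a SQLite table, stamping each with the scrape time."""
    if scraped_at is None:
        scraped_at = datetime.now(timezone.utc).isoformat()
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS results "
                "(scraped_at TEXT, name TEXT, value TEXT)"
            )
            conn.executemany(
                "INSERT INTO results VALUES (?, ?, ?)",
                [(scraped_at, name, value) for name, value in rows],
            )
    finally:
        conn.close()
    return len(rows)


if __name__ == "__main__":
    count = save_rows(DB_PATH, scrape())
    print(f"saved {count} rows")
```

Stamping every row with the run time keeps the hourly snapshots distinguishable over the several weeks of collection, and redirecting output to a log file in the crontab entry leaves a record when a run fails.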

Related questions

Using Google App Engine to update files on Google Compute Engine

How to migrate Google App Engine Project to Compute Engine completely?

Web Services with Google App Engine

Google Compute Engine OpenERP

Google compute engine example

Web/Screen Scraping with Google App Engine - Code works in python interpreter but not GAE

How can i run chromium web driver on google compute engine?

Passing Data from App Engine to Compute Engine

Google Compute Engine firebase is not a module

python multiprocessing google compute engine


