Why does this Python script work on my local machine but not on Heroku?

Hi there. I'm building a simple scraping tool. Here's the code that I have for it:

from bs4 import BeautifulSoup
import requests
from lxml import html
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import datetime

scope = ['https://spreadsheets.google.com/feeds']

credentials = ServiceAccountCredentials.from_json_keyfile_name(
    'Programming 4 Marketers-File-goes-here.json', scope)

site = 'http://nathanbarry.com/authority/'
hdr = {'User-Agent':'Mozilla/5.0'}
req = requests.get(site, headers=hdr)

soup = BeautifulSoup(req.content)

def getFullPrice(soup):
    divs = soup.find_all('div', id='complete-package')
    price = ""
    for i in divs:
        price = i.a
    completePrice = (str(price).split('$',1)[1]).split('<', 1)[0]
    return completePrice


def getVideoPrice(soup):
    divs = soup.find_all('div', id='video-package')
    price = ""
    for i in divs:
        price = i.a
    videoPrice = (str(price).split('$',1)[1]).split('<', 1)[0]
    return videoPrice

fullPrice = getFullPrice(soup)
videoPrice = getVideoPrice(soup)
date = datetime.date.today()

gc = gspread.authorize(credentials)
wks = gc.open("Authority Tracking").sheet1

row = len(wks.col_values(1))+1

wks.update_cell(row, 1, date)
wks.update_cell(row, 2, fullPrice)
wks.update_cell(row, 3, videoPrice)

This script runs on my local machine. But when I deploy it as part of an app to Heroku and try to run it, I get the following error:

Traceback (most recent call last):
  File "/app/.heroku/python/lib/python3.6/site-packages/gspread/client.py", line 219, in put_feed
    r = self.session.put(url, data, headers=headers)
  File "/app/.heroku/python/lib/python3.6/site-packages/gspread/httpsession.py", line 82, in put
    return self.request('PUT', url, params=params, data=data, **kwargs)
  File "/app/.heroku/python/lib/python3.6/site-packages/gspread/httpsession.py", line 69, in request
    response.status_code, response.content))
gspread.exceptions.RequestError: (400, "400: b'Invalid query parameter value for cell_id.'")

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "AuthorityScraper.py", line 44, in <module>
    wks.update_cell(row, 1, date)
  File "/app/.heroku/python/lib/python3.6/site-packages/gspread/models.py", line 517, in update_cell
    self.client.put_feed(uri, ElementTree.tostring(feed))
  File "/app/.heroku/python/lib/python3.6/site-packages/gspread/client.py", line 221, in put_feed
    if ex[0] == 403:
TypeError: 'RequestError' object does not support indexing

What do you think might be causing this error? Do you have any suggestions for how I can fix it?

There are a couple of things going on:

1) The Google Sheets API returned an error, "Invalid query parameter value for cell_id":

gspread.exceptions.RequestError: (400, "400: b'Invalid query parameter value for cell_id.'")

2) A bug in gspread caused a second exception while it was handling that error:

TypeError: 'RequestError' object does not support indexing

Python 3 removed __getitem__ from BaseException, which this gspread error handling relies on. This doesn't matter too much, because gspread would have raised an UpdateCellError exception anyway.
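To illustrate the second point, here is a minimal sketch of why `ex[0]` fails on Python 3 while `ex.args[0]` still works. The `RequestError` class below is a stand-in defined for this demo (an assumption, not imported from gspread), and the helper names are hypothetical.

```python
class RequestError(Exception):
    """Stand-in for gspread.exceptions.RequestError (assumption for this demo)."""


def status_by_index(ex):
    # Python 2 style: index the exception object itself.
    # On Python 3 this raises TypeError because BaseException has no __getitem__.
    return ex[0]


def status_by_args(ex):
    # Works on Python 3: the constructor arguments are kept in ex.args.
    return ex.args[0]


err = RequestError(400, "400: b'Invalid query parameter value for cell_id.'")

try:
    status_by_index(err)
except TypeError as type_error:
    print("indexing failed:", type_error)

print("status via args:", status_by_args(err))
```

The `TypeError` in the traceback is therefore a side effect of the error handling, and the 400 response from the API is the problem to investigate.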

My guess is that you are passing an invalid row number to update_cell. It would help to add some debug logging to your script that shows, for example, which row it is trying to update.
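One possible way to add that logging is sketched below. The helpers `next_row` and `log_update` are hypothetical names, and the sample list only imitates the shape of what `wks.col_values(1)` returns. Skipping trailing empty cells is an assumption about how the row count could end up too large, not a confirmed diagnosis.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("AuthorityScraper")


def next_row(col_values):
    """Return the 1-based row index just after the last non-empty cell."""
    last_filled = 0
    for index, value in enumerate(col_values, start=1):
        # Trailing empty cells are ignored so they do not inflate the result.
        if value not in (None, ""):
            last_filled = index
    return last_filled + 1


def log_update(row, col, value):
    """Log the exact arguments about to be passed to update_cell."""
    log.info("update_cell(row=%r, col=%r, value=%r)", row, col, value)
    return row, col, value


# Illustrative values only; in the script this list would come from wks.col_values(1).
row = next_row(["2017-06-01", "2017-06-02", "", ""])
log_update(row, 1, "2017-06-03")
```

Running the same logging locally and on Heroku should show whether the two environments compute different row numbers.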

It may be better to start with a worksheet that has zero rows and use append_row instead. However, there does seem to be an outstanding issue in gspread with append_row, and it may actually be the same issue you are running into.
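A rough sketch of the append_row approach follows. The helper `append_price_row` and the `FakeWorksheet` class are hypothetical; the fake only lets the helper be tried without credentials, and in the real script the object returned by `gc.open("Authority Tracking").sheet1` would be passed in. Sending the date as an ISO string rather than a `date` object is my own assumption, and the prices are placeholder values.

```python
import datetime


def append_price_row(worksheet, full_price, video_price, today=None):
    """Append one row (date, full price, video price) via worksheet.append_row."""
    today = today or datetime.date.today()
    # Convert the date to a plain string before sending it (assumption).
    values = [today.isoformat(), full_price, video_price]
    worksheet.append_row(values)
    return values


class FakeWorksheet:
    """Hypothetical offline stand-in for a gspread worksheet."""

    def __init__(self):
        self.rows = []

    def append_row(self, values):
        # Record the row instead of calling the Google Sheets API.
        self.rows.append(list(values))


sheet = FakeWorksheet()
append_price_row(sheet, "39", "99", today=datetime.date(2017, 6, 1))
print(sheet.rows)
```

With this shape the script no longer computes a row number itself, so an invalid row index cannot be passed by mistake.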

I encountered the same problem. BS4 works fine on a local machine. However, for some reason it is far too slow on the Heroku server, which results in an error.

I switched to lxml and it is working fine now.

Install it with this command:

pip install lxml

A sample code snippet is given below:

from lxml import html
import requests

# Fetch the page and parse the response body with lxml.
getpage = requests.get("https://url_here")
gethtmlcontent = html.fromstring(getpage.content)

# Sample XPath: text of every div with the dummy class "class-name".
data = gethtmlcontent.xpath('//div[@class="class-name"]/text()')

n = 10  # placeholder: number of items to keep, as per your requirement
data = data[0:n]

# Now inject the data into the Django template.
