
Heroku/Dash app Python, reading file on Google Cloud Storage

I have a Dash web app deployed on Heroku, and it needs to read a .csv file located on Google Cloud Storage. To do this, I give the app the credentials to access my Google Cloud account, and then I can load the file with pandas:

import pandas as pd 
df = pd.read_csv("gs://bucket_name/file_name.csv")

This .csv file is updated regularly, but the application does not pick up the updates.

The application loads the file when it is deployed, but it never reloads it afterwards, so updates are not taken into account until I deploy again.

Is there a way to force the app to read the file each time I refresh the web browser, so that every update is taken into account?

Thank you in advance. Best regards

I think a decorator would be useful here. Keep in mind that, depending on the size of the file, you might get some extra latency, as it needs to reload the df each time.
You'll need to decorate each view that needs the df to be reloaded.

Another option would be to set up a specific endpoint that forces a reload of the df and use Heroku's scheduler to call that endpoint. This removes the extra latency from the other requests, but the app will sometimes show stale data until the next scheduled reload.
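For the scheduler option, the scheduled job only has to send a request to the reload endpoint. Below is a minimal sketch of such a job using only the standard library. It assumes the reload route is the `/forcereload` path used in the example in this answer; the function name `trigger_reload`, the `opener` parameter, and the 30-second timeout are hypothetical choices for illustration.

```python
from urllib.request import urlopen


def trigger_reload(base_url, opener=urlopen, timeout=30):
    """Call the app's reload endpoint and return the HTTP status code.

    base_url is the public URL of the deployed app (an assumption: pass
    in whatever URL your app is actually served from).
    opener defaults to urllib's urlopen; it is a parameter only so the
    function can be exercised without a network connection.
    """
    url = base_url.rstrip("/") + "/forcereload"
    with opener(url, timeout=timeout) as response:
        return response.status
```

The scheduled task would then call `trigger_reload(...)` with the app's URL at whatever interval is acceptable for the data to be out of date.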

See the short example below:

from functools import wraps

import pandas as pd
from flask import Flask


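app = Flask(__name__)
# Loaded once at import time; without a reload this copy never changes.
df = pd.read_csv("gs://bucket_name/file_name.csv")


def reload_df(f):
    """Re-read the CSV from GCS before running the decorated view."""
    @wraps(f)
    def decorated_function(*args, **kwargs):
        global df
        df = pd.read_csv("gs://bucket_name/file_name.csv")
        return f(*args, **kwargs)
    return decorated_function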


@app.route("/")
@reload_df
def index():
    return "hello world"


# Not decorated: uses whatever df was loaded most recently.
@app.route("/not_reloading_df")
def not_reloading_df():
    return "still using previous DF"


@app.route("/forcereload")
@reload_df
def force_reload():
    return "Reloaded DF"


if __name__ == "__main__":
    app.run(debug=True)
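To see what the decorator changes without needing Flask, pandas, or a bucket, the same pattern can be run with a stand-in for the remote file. This is only a sketch: `remote_rows`, `load_rows`, `reload_rows`, and the two view functions are hypothetical names, and a plain list plays the role of the CSV on Google Cloud Storage.

```python
from functools import wraps

# Hypothetical stand-in for the CSV on Google Cloud Storage.
remote_rows = ["row 1"]


def load_rows():
    """Hypothetical loader, playing the role of pd.read_csv("gs://...")."""
    return list(remote_rows)


# Loaded once at import time, like the module-level df above.
rows = load_rows()


def reload_rows(f):
    """Refresh the module-level copy before calling the wrapped function."""
    @wraps(f)
    def decorated_function(*args, **kwargs):
        global rows
        rows = load_rows()
        return f(*args, **kwargs)
    return decorated_function


def plain_view():
    return len(rows)


@reload_rows
def reloading_view():
    return len(rows)


# The "file" changes after the app has started.
remote_rows.append("row 2")

stale_count = plain_view()      # still the copy from startup
fresh_count = reloading_view()  # reloads first, then reads
later_count = plain_view()      # now sees the copy refreshed above
```

Here `plain_view` only sees the update after `reloading_view` has run, which matches the `/not_reloading_df` route above: it returns whatever copy was loaded most recently by this process.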
