
How to load files to Google App Engine in the standard environment

I am using the Google App Engine standard (not flex) environment with Python 2.7, and I need to load some pre-trained models (Gensim's Word2vec and Keras's LSTM).

I need to load them once (since loading is very slow, taking around 1.5 seconds) and keep them available for fast access for several hours.

What is the best and fastest way to do so?

Thanks!

IMHO the best place for read-only data (including imported code!) that individual requests may need at any time is the application's global variables area, i.e. module-level variables.

Such variables are typically loaded exactly once per GAE instance lifetime and remain available until the instance goes away.

Since loading the data is expensive, you need to be aware that it could impact the response time of requests that come in while the instance is starting up (i.e. while the loading request is still active). There are 2 ways to address this:

  • one would be to use "lazy" loading of the data, which is effective if just a small percentage of the incoming requests actually need the data. But the requests which need the data when it's not yet loaded will still be affected, so this only reduces the impact of the problem. The method is described in detail in the "App Engine Startup time and the Global Variable problem" article:

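     from google.appengine.ext import ndb

     # Assumed model: the original snippet uses Settings without defining it.
     class Settings(ndb.Model):
         name = ndb.StringProperty()
         value = ndb.StringProperty()

     # a global variable, filled in on first use
     gCDNServer = None

     def getCDN():
         global gCDNServer
         if gCDNServer is None:
             # .get() returns the first matching entity, or None if there is none
             gCDNServer = Settings.query(Settings.name == "gCDNServer").get().value
         return gCDNServer
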
  • another approach, which avoids the problem for instances that receive a warmup request, is to make your app support warmup requests (available only if you're using automatic scaling). The data would be loaded by the warmup request handler and would already be in memory for "live" requests (because no "live" requests are routed to the instance until the warmup request handling completes).
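
As a rough illustration of the warmup approach, here is a minimal sketch written as a plain WSGI callable so that it has no dependencies. It assumes warmup requests are enabled in app.yaml (`inbound_services: - warmup`) and arrive at `/_ah/warmup`. The names `load_model`, `get_model` and `app`, and the dictionary standing in for the model, are hypothetical placeholders for the real Word2vec/Keras loading code.

```python
import threading

_model = None
_model_lock = threading.Lock()


def load_model():
    """Hypothetical stand-in for the slow (~1.5 s) model load."""
    return {"hello": [0.1, 0.2]}


def get_model():
    """Return the shared model, loading it once per process if needed."""
    global _model
    if _model is None:
        with _model_lock:
            # Re-check inside the lock so only one thread does the load.
            if _model is None:
                _model = load_model()
    return _model


def app(environ, start_response):
    """Plain WSGI app: the warmup path preloads, other paths use the model."""
    path = environ.get("PATH_INFO", "")
    if path == "/_ah/warmup":
        get_model()  # pay the loading cost before live traffic arrives
        body = b"warmed up"
    else:
        vector = get_model().get(path.strip("/"))
        body = repr(vector).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]
```

Live handlers still go through `get_model()`, so an instance that happens to start without a warmup request falls back to lazy loading instead of failing.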

It might be possible to add logic to drop the data from memory (to reduce the app's memory footprint) if/when you know it'll no longer be needed (i.e. after those several hours you mentioned have expired), but that complicates the picture, especially if you configured your app as threadsafe. I'd simply put the code which needs the data and the code which doesn't into different services, and let autoscaling shut down the instances holding the global data when they are no longer needed.
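
To give an idea of the extra bookkeeping that dropping the data involves in a threadsafe app, here is a minimal sketch using only the standard library. The class `ExpiringResource`, its `ttl_seconds` parameter and the injectable `clock` are hypothetical names chosen for illustration, not part of any App Engine API.

```python
import threading
import time


class ExpiringResource(object):
    """Lazily loads an object and lets it be dropped after an idle period."""

    def __init__(self, loader, ttl_seconds, clock=time.time):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._value = None
        self._last_used = None

    def get(self):
        """Return the object, loading it if it is not currently held."""
        with self._lock:
            if self._value is None:
                # Other threads wait on the lock while the load runs.
                self._value = self._loader()
            self._last_used = self._clock()
            return self._value

    def drop_if_idle(self):
        """Release the object if unused for longer than the TTL.

        Returns True if the object was dropped, False otherwise.
        """
        with self._lock:
            if self._value is None:
                return False
            if self._clock() - self._last_used > self._ttl:
                self._value = None
                return True
            return False
```

Something still has to call `drop_if_idle()` periodically, and a request that already holds a reference keeps the object alive until it finishes, which is part of why leaving the cleanup to autoscaling is the simpler option.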

Note: technical posts on this site are licensed under CC BY-SA 4.0; if you republish one, please cite this site's URL or the original address.

 