
Cloud Run - Requests latency

Original post: 2020-09-22 09:12:38 | Tags: python-3.x, google-cloud-platform, google-cloud-run, fastapi, s2

I am trying to use Cloud Run to run a microservice connected to Firestore. The microservice uses s2geometry to create multiple geographical zones with specific attributes, which helps localize users so that I can send them information according to the zone they are in.

I used Python 3.7 and FastAPI to create the microservice and the routes to communicate with it.

The microservice runs smoothly on my local machine and on Compute Engine: most of my routes take less than 150 ms to answer when I test them. However, I have a latency issue when I deploy it with Cloud Run. From time to time the microservice takes a really long time to answer (up to 15 minutes), and I can't pinpoint what exactly causes it.

Here is a screenshot showing the Request Count and the Request Latency:

[Screenshot: Request Count and Request Latency]

There is no real correlation between request latency and the number of requests, or at least no obvious one. I also looked at the memory usage of the service, which is at 30% at most. The CPU usage, however, sometimes hits 100%, but not necessarily when requests are slow.

Finally, when I explored the Trace List and compared high-latency requests with fast ones, I noticed the following difference:

[Screenshot: Trace of slow request]
[Screenshot: Trace of fast request]

Fast requests seem to call themselves whereas slow requests don't, and I do not know why.

For now we do not have many users, so I thought it could be a cold start issue, but the slow requests are not necessarily the first ones.

To be honest, I don't know what is going on here, what Cloud Run does, or what I did wrong. I also find it pretty difficult to find a thorough explanation of how Cloud Run actually works, so if you have one (other than Google's own documentation) I would gladly dive into it.

Thank you very much for your help.

1 Answer

After several tests it seems that it was a cold start issue. Cloud Run containers are stopped after a certain period if they are not being used, and as we did not have a lot of traffic, the container had to restart every time a user wanted to access the app.
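One way to check whether a slow request coincides with a freshly started container is to record, per instance, whether a request is the first one it serves and how old the instance is. The following is only a minimal sketch using the standard library; `describe_request` and the field names are hypothetical, and wiring it into a FastAPI middleware or route is left out.

```python
import time

# Set once at import time, i.e. when a new container instance loads the app.
INSTANCE_STARTED_AT = time.monotonic()
_requests_served = 0


def describe_request():
    """Return diagnostic fields to log alongside each handled request."""
    global _requests_served
    _requests_served += 1
    return {
        # True only for the first request this instance handles.
        "first_request_on_instance": _requests_served == 1,
        # Seconds elapsed since this instance loaded the module.
        "instance_age_seconds": round(time.monotonic() - INSTANCE_STARTED_AT, 3),
    }
```

Logging this dictionary with each request would make it possible to compare slow traces against instance age. Each container instance keeps its own counter, so a "first request" can appear at any point in the overall traffic, not only at the very beginning.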

Solution:

I created a Cloud Function that sends a request to the container when triggered, and then created a Cloud Scheduler job that runs the function every minute.
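The keep-warm function described above could look roughly like the sketch below. This is not the exact code I deployed: the `SERVICE_URL` environment variable and the function names are assumptions for illustration, and it assumes the Cloud Run service accepts unauthenticated requests (otherwise an identity token would have to be attached).

```python
import os
import urllib.request


def ping(url, timeout=30):
    """Send a GET request to the service and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def keep_warm(request):
    """HTTP-triggered entry point, meant to be called by Cloud Scheduler."""
    # Hypothetical: the Cloud Run URL is provided as an environment variable.
    url = os.environ["SERVICE_URL"]
    status = ping(url)
    return f"pinged {url}: {status}"
```

The `request` argument is unused here; it is only present because HTTP-triggered Python functions receive the incoming request as their single parameter.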

Note:

If different revisions are routed to your service, you need to create a Cloud Scheduler job for each revision. To do so, you have to create a Revision URL (tag) for each routed revision (currently in beta).

Edit:

As @Jofre mentioned, you can now choose to always have an instance of your service running by setting the "Minimum number of instances" to 1. If you are using the console, GCP even tells you to "Set to 1 to reduce cold starts".