
Dotnet Core Docker Container Leaks RAM on Linux and Causes OOM

I am running Dotnet Core 2.2 in a Linux container in Docker.

I've tried many different configuration/environment options, but I keep coming back to the same problem of running out of memory ('docker events' reports an OOM).
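For anyone reproducing this, a minimal sketch of how the OOM kill can be confirmed from the Docker side. The helper name was_oom_killed and the container name in the usage comment are placeholders, not something from my setup.

```shell
# Print "true" if Docker recorded an OOM kill for the given container,
# "false" otherwise. The container name or ID is passed as the first argument.
was_oom_killed() {
    docker inspect --format '{{.State.OOMKilled}}' "$1"
}

# Stream only OOM events as they happen (blocks until interrupted):
#   docker events --filter event=oom
# Check one container after it has exited (placeholder name):
#   was_oom_killed webapplication1
```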

In production I'm hosting on Ubuntu. For development, I'm using a Linux container (MobyLinux) on Docker on Windows.

I've gone back to running the Web API template project rather than my actual app. I am literally returning a string and doing nothing else. If I call it about 1,000 times from curl, the container dies. The garbage collector does not appear to be working at all.

I tried setting the following environment variables in the docker-compose file:

DOTNET_RUNNING_IN_CONTAINER=true
DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true
ASPNETCORE_preventHostingStartup=true

I also tried the following in the docker-compose file:

mem_reservation: 128m
mem_limit: 256m
memswap_limit: 256m

(these only make it die faster)

I tried setting the following to true or false, with no difference:

ServerGarbageCollection

I have also tried running as a Windows container instead. This doesn't OOM, but it does not seem to respect the memory limits either.

I have already ruled out HttpClient and EF Core, as I'm not even using them in my example. I have read a bit about listening on port 443 being a problem: if I leave the container running idle all day and check at the end of the day, it has used up some more memory (not a massive amount, but it grows).

Example of what's in my API:

// GET api/values/5
[HttpGet("{id}")]
public ActionResult<string> Get(int id)
{
    return "You said: " + id;
}

Example curl call:

curl -X GET "https://localhost:44329/api/values/7" -H  "accept: text/plain" --insecure

(repeated 1,000 or so times)
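The repetition can be scripted roughly as below. This is a sketch rather than the exact script I ran; the function name hammer and the success counter are additions for illustration.

```shell
# Send COUNT sequential GET requests to URL and report how many returned 200.
# Usage: hammer URL [COUNT]   (COUNT defaults to 1000)
hammer() {
    url=$1
    count=${2:-1000}
    ok=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # -s: silent, -o /dev/null: discard the body, -w: print only the
        # status code, --insecure: accept the self-signed dev certificate.
        status=$(curl -s -o /dev/null -w '%{http_code}' --insecure \
            -H 'accept: text/plain' "$url")
        if [ "$status" = "200" ]; then
            ok=$((ok + 1))
        fi
        i=$((i + 1))
    done
    echo "$ok/$count requests returned 200"
}

# Example, assuming the container from this question is running:
#   hammer "https://localhost:44329/api/values/7" 1000
```

Watching `docker stats` in a second terminal while this runs shows the container's memory usage over time.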

Expected: RAM usage to remain low for a very primitive request

Actual: RAM usage continues to grow until failure

Full Dockerfile:

FROM microsoft/dotnet:2.2-aspnetcore-runtime AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443

FROM microsoft/dotnet:2.2-sdk AS build
WORKDIR /src
COPY ["WebApplication1/WebApplication1.csproj", "WebApplication1/"]
RUN dotnet restore "WebApplication1/WebApplication1.csproj"
COPY . .
WORKDIR "/src/WebApplication1"
RUN dotnet build "WebApplication1.csproj" -c Release -o /app

FROM build AS publish
RUN dotnet publish "WebApplication1.csproj" -c Release -o /app

FROM base AS final
WORKDIR /app
COPY --from=publish /app .
ENTRYPOINT ["dotnet", "WebApplication1.dll"]

docker-compose.yml:

version: '2.3'

services:
  webapplication1:
    image: ${DOCKER_REGISTRY-}webapplication1
    mem_reservation: 128m
    mem_limit: 256m
    memswap_limit: 256m
    cpu_percent: 25
    build:
      context: .
      dockerfile: WebApplication1/Dockerfile

docker-compose.override.yml:

version: '2.3'

services:
  webapplication1:
    environment:
      - ASPNETCORE_ENVIRONMENT=Development
      - ASPNETCORE_URLS=https://+:443;http://+:80
      - ASPNETCORE_HTTPS_PORT=44329
      - DOTNET_RUNNING_IN_CONTAINER=true
      - DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true
      - ASPNETCORE_preventHostingStartup=true
    ports:
      - "50996:80"
      - "44329:443"
    volumes:
      - ${APPDATA}/ASP.NET/Https:/root/.aspnet/https:ro
      - ${APPDATA}/Microsoft/UserSecrets:/root/.microsoft/usersecrets:ro

I'm running Docker CE Engine 18.0.9.1 on Windows and 18.06.1 on Ubuntu. To confirm: I have also tried Dotnet Core 2.1.

I've also tried it in IIS Express. The process gets to around 55MB, and that's while literally spamming it from multiple threads.

When the requests are all done, it goes down to around 29-35MB.

This could be because garbage collection (GC) is not being executed.

This open issue looks very similar:

https://github.com/dotnet/runtime/issues/851

One solution that worked on Ubuntu 18.04.4 in a virtual machine was switching to Workstation garbage collection:

<PropertyGroup>
    <ServerGarbageCollection>false</ServerGarbageCollection>
</PropertyGroup>

https://github.com/dotnet/runtime/issues/851#issuecomment-644648315

https://github.com/dotnet/runtime/issues/851#issuecomment-438474207

https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/workstation-server-gc
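To try the same switch without rebuilding the image, the GC flavor can also be selected through the COMPlus_gcServer environment variable (0 for workstation, 1 for server). The sketch below is an assumption-laden illustration: the function name is made up, and the 256m limits simply mirror the compose file in the question.

```shell
# Run IMAGE with Workstation GC forced through the environment, keeping the
# same 256 MB memory and swap cap used in the question's compose file.
run_with_workstation_gc() {
    image=$1
    docker run --rm \
        -e COMPlus_gcServer=0 \
        --memory 256m --memory-swap 256m \
        "$image"
}

# Example (image name taken from the question's compose file):
#   run_with_workstation_gc webapplication1
```

In docker-compose the equivalent would be one more entry under environment, next to the existing variables.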

This is another finding:

After further investigation I noticed that there is a big difference between my servers in the number of available logical CPUs (80 vs. 16). After some googling I came across the topic dotnet/runtime#622, which led me to experiment with CPU/GC/thread settings.

I tried using the --cpus constraint in the stack file; explicitly setting System.GC.Concurrent=true, System.GC.HeapCount=8, System.GC.NoAffinitize=true, and System.Threading.ThreadPool.MaxThreads=16 in the runtimeconfig.template.json file; and updating the images to the 3.1.301-bionic SDK and the 3.1.5-bionic ASP.NET runtime. I tried all of these in various combinations, and none of them had any effect. The application just hangs until it gets OOMKilled.
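For reference, the four settings named above would look roughly like this in a runtimeconfig.template.json placed next to the .csproj. This is a reconstruction from the setting names in the comment, not the commenter's actual file, and the helper function name is made up.

```shell
# Write a runtimeconfig.template.json containing the four settings listed
# above. The target path defaults to the current directory.
write_runtimeconfig_template() {
    cat > "${1:-runtimeconfig.template.json}" <<'EOF'
{
  "configProperties": {
    "System.GC.Concurrent": true,
    "System.GC.HeapCount": 8,
    "System.GC.NoAffinitize": true,
    "System.Threading.ThreadPool.MaxThreads": 16
  }
}
EOF
}

# Example: write the file into the project directory (placeholder path):
#   write_runtimeconfig_template WebApplication1/runtimeconfig.template.json
```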

The only thing that makes it work with Server GC is the --cpuset-cpus constraint. Of course, explicitly setting the available processors is not an option in docker swarm mode. But I was experimenting with the available CPUs to find any regularity, and I found a few interesting facts.

Interestingly, I had previously migrated 3 other backend services to the new server cluster, and they all run fine with the default settings. Their memory limit is set to 600 MB, but in fact they need about 400 MB to run. Things go wrong only with memory-consuming applications (I have two of those): each requires 3 GB to build in-memory structures and runs with a 6 GB constraint.

It keeps working with any number of available CPUs in the range [1, 35] and hangs when the CPU count is 36.

https://github.com/dotnet/runtime/issues/851#issuecomment-645237830
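A sketch of how the CPU-count experiment from that comment could be repeated with plain docker run. The function name and image name are placeholders; the 6g limit is taken from the memory constraint mentioned in the comment.

```shell
# Start IMAGE pinned to the first N logical CPUs (IDs 0 .. N-1), with the
# 6 GB memory limit mentioned in the quoted comment.
run_with_cpus() {
    image=$1
    n=$2
    if [ "$n" -le 1 ]; then
        cpus=0
    else
        cpus="0-$((n - 1))"
    fi
    docker run --rm --cpuset-cpus "$cpus" --memory 6g "$image"
}

# Example: 35 CPUs reportedly works, 36 reportedly hangs (placeholder image):
#   run_with_cpus my-service:latest 35
#   run_with_cpus my-service:latest 36
```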

