
Style transfer models using much more memory on iPhone Xs

I'm using Core ML models for image style transfer. An initialized model takes ~60 MB of memory on an iPhone X running iOS 12. However, the same model loaded on an iPhone Xs (Max) consumes more than 700 MB of RAM.

In Instruments I can see that the runtime allocates 38 IOSurfaces with a memory footprint of up to 54 MB each, alongside numerous other Core ML (Espresso) related objects. Those are not there on the iPhone X.

My guess is that the Core ML runtime does something different in order to utilize the power of the A12. However, my app crashes due to the memory pressure.

I already tried converting my models again with the newest version of coremltools. However, the results are identical.

Did I miss something?

Here are some findings and a work-around:

From what I could see in Instruments, I conclude that the Core ML runtime pre-allocates all buffers required to execute the neural network (hence the many IOSurfaces) when initializing the model, using the method Espresso::ANERuntimeEngine::blob_container::force_allocate(). Interestingly, this only happens for models with a relatively large input size (1792 x 1792) and not for smaller ones (1024 x 1024).

Since this only happens on the Xs, I assumed it has something to do with the A12's Neural Engine. So I configured the model to use only the CPU and GPU as compute units (MLComputeUnitsCPUAndGPU instead of MLComputeUnitsAll), and that did the trick: no more pre-allocated buffers. So I'm using this as a work-around for now.
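In Swift, the work-around can be sketched roughly like this (a minimal example; the model file name "StyleTransfer" is a placeholder for your own compiled .mlmodelc, and the Objective-C constant MLComputeUnitsCPUAndGPU corresponds to `.cpuAndGPU` in Swift):

```swift
import CoreML

// Restrict Core ML to CPU and GPU so the runtime does not pre-allocate
// the large Neural Engine buffers observed on the A12 (iPhone Xs).
let config = MLModelConfiguration()
config.computeUnits = .cpuAndGPU  // instead of the default .all

// Load the compiled model with this configuration.
// "StyleTransfer.mlmodelc" is a placeholder name for your model bundle.
guard let modelURL = Bundle.main.url(forResource: "StyleTransfer",
                                     withExtension: "mlmodelc") else {
    fatalError("Compiled model not found in bundle")
}

do {
    let model = try MLModel(contentsOf: modelURL, configuration: config)
    // Run predictions as usual, e.g. via model.prediction(from:).
    _ = model
} catch {
    print("Failed to load model: \(error)")
}
```

Note that `MLModelConfiguration` and its `computeUnits` property are available starting with iOS 12, so this should work on the same OS version where the issue appears.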

