
Using ARM Neon acceleration with TFLite C++ API on Android

I am trying to utilize Neon acceleration for TFLite inference on an Android device. While this appears to be well documented and straightforward for Java, I could use help in getting started with the C++ API. I am new to this, so my apologies if the answer is obvious.

The TensorFlow Lite source tree contains NEON-optimized kernels under https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/kernels , but I am not sure how and where to include and use them.

The device specs are: octa-core processor, 2000 MHz, ARM Cortex-A75 and ARM Cortex-A53, 64-bit, 10 nm; CPU 2x2.0 GHz Kryo 360 Gold & 6x1.7 GHz Kryo 360 Silver; GPU Adreno 615.

What I've tried so far: I changed the build.gradle file from

android {
    externalNativeBuild {
        cmake {
            path file('CMakeLists.txt')
        }
    }
}

to

android {
    defaultConfig {
        externalNativeBuild {
            cmake {
                arguments "-DANDROID_ARM_NEON=TRUE"
            }
        }
    }
    externalNativeBuild {
        cmake {
            path file('CMakeLists.txt')
        }
    }
}

After this change, inference took just as long as before, and I got the following error once inference finished:

A/libc: Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x753b9d3000 in tid 9935 (m.example.test2), pid 9935 (m.example.test2)

You can opt in to the XNNPACK delegate, which uses ARM NEON-optimized kernels when your CPU supports them.

https://blog.tensorflow.org/2020/07/accelerating-tensorflow-lite-xnnpack-integration.html

It's much easier to enable XNNPACK with the Java / Obj-C / Swift APIs by setting a boolean flag, as explained in the blog post. If you need to use C++ directly for some reason, you could do something like this:

#include "tensorflow/lite/delegates/xnnpack/xnnpack_delegate.h"
#include "tensorflow/lite/interpreter.h"

// ...

// Create the XNNPACK delegate with default options.
TfLiteXNNPackDelegateOptions options = TfLiteXNNPackDelegateOptionsDefault();
// options.num_threads = <desired_num_threads>;
tflite::Interpreter::TfLiteDelegatePtr delegate(
    TfLiteXNNPackDelegateCreate(&options),
    [](TfLiteDelegate* delegate) { TfLiteXNNPackDelegateDelete(delegate); });
auto status = interpreter->ModifyGraphWithDelegate(std::move(delegate));
// check on the returned status code: it should be kTfLiteOk ...
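
If it helps to see the delegate in context, here is a minimal end-to-end sketch of a C++ inference flow. The file name "model.tflite", the function name RunWithXnnpack, and the error handling are placeholders for illustration, not part of the TFLite API. Depending on the TFLite version, ModifyGraphWithDelegate accepts either the smart pointer directly (as in the snippet above) or only a raw pointer; passing delegate.get() and keeping the smart pointer alive until the interpreter is destroyed works in both cases.

#include <memory>

#include "tensorflow/lite/delegates/xnnpack/xnnpack_delegate.h"
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int RunWithXnnpack() {
  // Load the model ("model.tflite" is a placeholder path).
  auto model = tflite::FlatBufferModel::BuildFromFile("model.tflite");
  if (model == nullptr) return 1;

  // Create the XNNPACK delegate; the deleter calls TfLiteXNNPackDelegateDelete
  // when the smart pointer goes out of scope.
  TfLiteXNNPackDelegateOptions options = TfLiteXNNPackDelegateOptionsDefault();
  tflite::Interpreter::TfLiteDelegatePtr delegate(
      TfLiteXNNPackDelegateCreate(&options),
      [](TfLiteDelegate* d) { TfLiteXNNPackDelegateDelete(d); });

  // Build the interpreter. It is declared after the delegate so that it is
  // destroyed first, keeping the delegate alive for the interpreter's lifetime.
  tflite::ops::builtin::BuiltinOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  if (tflite::InterpreterBuilder(*model, resolver)(&interpreter) != kTfLiteOk) return 1;

  // Apply the delegate. If this fails, the model still runs on the default CPU kernels.
  if (interpreter->ModifyGraphWithDelegate(delegate.get()) != kTfLiteOk) {
    // Log and continue, or bail out, depending on your needs.
  }

  if (interpreter->AllocateTensors() != kTfLiteOk) return 1;
  // ... copy input data into interpreter->typed_input_tensor<float>(0) ...
  if (interpreter->Invoke() != kTfLiteOk) return 1;
  // ... read results from interpreter->typed_output_tensor<float>(0) ...
  return 0;
}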

See also how the Java API enables the XNNPACK delegate by calling the C++ API internally.
