
Using ARM Neon acceleration with TFLite C++ API on Android

I am trying to utilize Neon acceleration for TFLite inference on an Android device. While this appears to be well documented and straightforward for Java, I could use help in getting started with the C++ API. I am new to this, so my apologies if the answer is obvious.

The TensorFlow Lite source tree contains NEON-optimized kernels under https://github.com/tensorflow/tensorflow/tree/master/tensorflow/lite/kernels , but I am not sure how and where to include and use them.

The device specs are: octa-core processor, 2000 MHz, ARM Cortex-A75 and ARM Cortex-A53, 64-bit, 10 nm; CPU 2x2.0 GHz Kryo 360 Gold & 6x1.7 GHz Kryo 360 Silver; GPU Adreno 615.

What I've tried so far: I changed the build.gradle file from

android {
    externalNativeBuild {
        cmake {
            path file('CMakeLists.txt')
        }
    }
}

to

android {
    defaultConfig {
        externalNativeBuild {
            cmake {
                arguments "-DANDROID_ARM_NEON=TRUE"
            }
        }
    }
    externalNativeBuild {
        cmake {
            path file('CMakeLists.txt')
        }
    }
}

After this change, inference took just as long as before, and I got the following error once inference finished:

A/libc: Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x753b9d3000 in tid 9935 (m.example.test2), pid 9935 (m.example.test2)

You can opt in to the XNNPACK delegate, which uses ARM NEON-optimized kernels when your CPU supports them.

https://blog.tensorflow.org/2020/07/accelerating-tensorflow-lite-xnnpack-integration.html

It's much easier to enable XNNPACK with the Java / Obj-C / Swift APIs by setting a boolean flag, as explained in the blog post. If you need to use C++ directly for some reason, you could do something like this:

#include "tensorflow/lite/delegates/xnnpack/xnnpack_delegate.h"
#include "tensorflow/lite/interpreter.h"

// ...

// Create the XNNPACK delegate with default options.
TfLiteXNNPackDelegateOptions options = TfLiteXNNPackDelegateOptionsDefault();
// options.num_threads = <desired_num_threads>;
tflite::Interpreter::TfLiteDelegatePtr delegate(
    TfLiteXNNPackDelegateCreate(&options),
    [](TfLiteDelegate* delegate) { TfLiteXNNPackDelegateDelete(delegate); });
auto status = interpreter->ModifyGraphWithDelegate(std::move(delegate));
// check on the returned status code: it should be kTfLiteOk ...
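
If it helps to see the delegate in context, here is a minimal end-to-end sketch of a C++ inference flow. The file name "model.tflite", the function name RunWithXnnpack, and the error handling are placeholders for illustration, not part of the TFLite API. Depending on the TFLite version, ModifyGraphWithDelegate accepts either the smart pointer directly (as in the snippet above) or only a raw pointer; passing delegate.get() and keeping the smart pointer alive until the interpreter is destroyed works in both cases.

#include <memory>

#include "tensorflow/lite/delegates/xnnpack/xnnpack_delegate.h"
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int RunWithXnnpack() {
  // Load the model ("model.tflite" is a placeholder path).
  auto model = tflite::FlatBufferModel::BuildFromFile("model.tflite");
  if (model == nullptr) return 1;

  // Create the XNNPACK delegate; the deleter calls TfLiteXNNPackDelegateDelete
  // when the smart pointer goes out of scope.
  TfLiteXNNPackDelegateOptions options = TfLiteXNNPackDelegateOptionsDefault();
  tflite::Interpreter::TfLiteDelegatePtr delegate(
      TfLiteXNNPackDelegateCreate(&options),
      [](TfLiteDelegate* d) { TfLiteXNNPackDelegateDelete(d); });

  // Build the interpreter. It is declared after the delegate so that it is
  // destroyed first, keeping the delegate alive for the interpreter's lifetime.
  tflite::ops::builtin::BuiltinOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  if (tflite::InterpreterBuilder(*model, resolver)(&interpreter) != kTfLiteOk) return 1;

  // Apply the delegate. If this fails, the model still runs on the default CPU kernels.
  if (interpreter->ModifyGraphWithDelegate(delegate.get()) != kTfLiteOk) {
    // Log and continue, or bail out, depending on your needs.
  }

  if (interpreter->AllocateTensors() != kTfLiteOk) return 1;
  // ... copy input data into interpreter->typed_input_tensor<float>(0) ...
  if (interpreter->Invoke() != kTfLiteOk) return 1;
  // ... read results from interpreter->typed_output_tensor<float>(0) ...
  return 0;
}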

See also how the Java API enables the XNNPACK delegate by calling the C++ API internally.
