Android OpenGL shader program to copy image from camera to SSBO for TF-Lite GPU inference

The TensorFlow Lite GPU delegate documentation describes a faster way to run TFLite inference on Android using OpenGL and an SSBO [3]. It provides sample code to create an SSBO and bind it to an image that is already on the GPU. How can we take an image from the live Android camera and copy or convert it into an SSBO using OpenGL shader code? When we simply dump CPU memory into an SSBO, performance is worse than with normal GPU delegate execution. What is the proper, or most efficient, way to pass the camera image to an SSBO so that TFLite inference runs faster?

In the following code we convert the camera frame to a bitmap, load it into a texture, and finally copy it to an SSBO. However, this method is comparatively slower than the normal GPU delegate pipeline (where data is copied from CPU to GPU, with the associated overhead). The aim is to reduce CPU-to-GPU copying by making the image data available in GPU memory and then passing it to the model. We can run the model [1] in 40-50 ms using the standard GPU delegate inference mechanism, whereas it takes 90-100 ms using the SSBO method above (unoptimized). These timings are for the interpreter.run() call in TensorFlow Lite. It also looks like this SSBO mechanism only works with OpenGL ES 3.1 or higher.
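Since the comparison above rests on timing a single call with System.currentTimeMillis(), a slightly more robust measurement may help when comparing the two paths. The following is a minimal sketch, not part of the original code: the class name InferenceTimer and its method are hypothetical, and it assumes the work being measured can be wrapped in a Runnable. It discards a few warm-up runs and averages the remaining ones using System.nanoTime().

```java
/** Hypothetical helper: averages the wall-clock time of a repeated task. */
final class InferenceTimer {
    private InferenceTimer() {}

    /**
     * Runs the task warmupRuns times without timing it, then returns the
     * mean duration in milliseconds over timedRuns further executions.
     */
    static double averageMillis(Runnable task, int warmupRuns, int timedRuns) {
        if (warmupRuns < 0 || timedRuns <= 0) {
            throw new IllegalArgumentException("warmupRuns >= 0 and timedRuns > 0 required");
        }
        // Warm-up iterations are executed but not measured.
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < timedRuns; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        // Convert nanoseconds to milliseconds, then average over the runs.
        return elapsedNanos / 1e6 / timedRuns;
    }
}
```

In the renderer this could be called as InferenceTimer.averageMillis(() -> interpreter.run(null, outputArray), 3, 20). Because GL commands may execute asynchronously, one option (an assumption, not something the original post does) is to call GLES31.glFinish() inside the task so that pending GPU work is included in the measurement.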

The ideal use case (as suggested by TensorFlow) is the following [2]:

  1. You get the camera input in the form of a surface texture.
  2. Create an OpenGL shader storage buffer object (SSBO).
  3. Use GpuDelegate.bindGlBufferToTensor() to associate that SSBO with the input tensor.
  4. Write a small shader program to dump the surface texture of step 1 into the SSBO of step 2 efficiently.
  5. Run inference.
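Steps 2 and 4 both depend on the input tensor shape: the SSBO must hold width x height x channels floats, and the compute dispatch must launch enough work groups to cover every texel. A minimal sketch of that arithmetic follows; the class name SsboSizing and its methods are hypothetical helpers, and it assumes a float32 input tensor and a square local work-group size as in the shaders in this post.

```java
/** Hypothetical helper for sizing an SSBO and a compute dispatch. */
final class SsboSizing {
    static final int BYTES_PER_FLOAT = 4;

    private SsboSizing() {}

    /** Bytes needed for a float32 tensor of width x height x channels. */
    static int byteSize(int width, int height, int channels) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        int elements = Math.multiplyExact(Math.multiplyExact(width, height), channels);
        return Math.multiplyExact(elements, BYTES_PER_FLOAT);
    }

    /** Number of work groups needed to cover extent items, rounding up. */
    static int workGroups(int extent, int localSize) {
        if (extent <= 0 || localSize <= 0) {
            throw new IllegalArgumentException("extent and localSize must be positive");
        }
        return (extent + localSize - 1) / localSize;
    }
}
```

For a 257x257x3 input this gives 792588 bytes and 17 work groups per axis with a local size of 16, whereas a 224x224x3 input needs 14. Whenever the group count is rounded up like this, the shader needs a bounds check so that the extra invocations do not write past the end of the buffer.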

We are able to get camera frames as raw bytes, convert them into a texture, and even render them to a GLSurfaceView. But we are unable to achieve the speedup suggested by TensorFlow.

  1. https://github.com/tensorflow/tensorflow/issues/26297
  2. https://github.com/tensorflow/tensorflow/issues/25657#issuecomment-466489248
  3. https://www.tensorflow.org/lite/performance/gpu_advanced#android_2

Android code:

    public int[] initializeShaderBuffer() {
        // Allocate an SSBO large enough for a 257x257x3 float32 tensor
        // (4 bytes per float). No initial data is uploaded.
        int[] id = new int[1];
        GLES31.glGenBuffers(id.length, id, 0);
        GLES31.glBindBuffer(GLES31.GL_SHADER_STORAGE_BUFFER, id[0]);
        GLES31.glBufferData(GLES31.GL_SHADER_STORAGE_BUFFER, 257 * 257 * 3 * 4, null, GLES31.GL_STREAM_COPY);

        GLES31.glBindBuffer(GLES31.GL_SHADER_STORAGE_BUFFER, 0); // unbind
        return id;
    }

    @Override
    public void onSurfaceCreated(GL10 glUnused, EGLConfig config) {
        .....
        .....

        mTextureDataHandle0 = TextureHelper.loadTexture(mActivityContext,
                R.drawable.srcim); // No error
    }

    @Override
    public void onDrawFrame(GL10 glUnused) {
        // Note: the SSBO, interpreter, delegate, and shader program are all
        // rebuilt on every frame here.
        int inputSsboId = initializeShaderBuffer()[0];

        interpreter = new Interpreter(GLActivity.tfliteModel);

        // Bind the SSBO to the model's input tensor before applying the delegate.
        Tensor inputTensor = interpreter.getInputTensor(0);
        GpuDelegate gpuDelegate = new GpuDelegate();
        gpuDelegate.bindGlBufferToTensor(inputTensor, inputSsboId);
        interpreter.modifyGraphWithDelegate(gpuDelegate);

        // The source in fragmentShader is compiled as a compute shader.
        final int computeShaderHandle = ShaderHelper.compileShader(
                GLES31.GL_COMPUTE_SHADER, fragmentShader); // No error
        mProgramHandle = ShaderHelper.createAndLinkProgram(vertexShaderHandle,
                computeShaderHandle); // No error

        // glUniform* applies to the program currently in use, so select it first.
        GLES31.glUseProgram(mProgramHandle);
        mTextureUniformHandle0 = GLES31.glGetUniformLocation(mProgramHandle,
                "u_Texture0");

        // Bind the source texture to texture unit 0 and point the sampler at it.
        GLES31.glActiveTexture(GLES31.GL_TEXTURE0);
        GLES31.glBindTexture(GLES31.GL_TEXTURE_2D, mTextureDataHandle0);
        GLES31.glUniform1i(mTextureUniformHandle0, 0);

        // Attach the SSBO to binding point 1, matching layout(binding = 1) in the shader.
        GLES31.glBindBufferRange(GLES31.GL_SHADER_STORAGE_BUFFER, 1, inputSsboId, 0, 257 * 257 * 3 * 4);

        // 16x16 groups of 16x16 invocations cover 256x256 texels.
        if (compute == 1) // Always set to 1
            GLES31.glDispatchCompute(16, 16, 1);
        // Make the shader's SSBO writes visible to later shader reads.
        GLES31.glMemoryBarrier(GLES31.GL_SHADER_STORAGE_BARRIER_BIT);

        GLES31.glBindBuffer(GLES31.GL_SHADER_STORAGE_BUFFER, 0); // unbind
        GLES31.glBindTexture(GLES31.GL_TEXTURE_2D, 0); // unbind

        // TFLite code ...
        byte[][] outputArray = new byte[1][66049]; // size based on model output
        Log.d("GPU_CALL_RUN", "DONE");
        long oms1 = System.currentTimeMillis();
        interpreter.run(null, outputArray);
        long cms1 = System.currentTimeMillis();
        Log.d("TIME_RUN_MODEL", "" + (cms1 - oms1));

        Log.d("OUTVAL", Arrays.deepToString(outputArray));
    }

Compute shader:

    #version 310 es
    layout(local_size_x = 16, local_size_y = 16) in;
    layout(binding = 0) uniform sampler2D u_Texture0;
    layout(std430) buffer;
    layout(binding = 1) buffer Output { float elements[]; } output_data;
    void main() {
        ivec2 gid = ivec2(gl_GlobalInvocationID.xy);
        // Skip invocations that fall outside the 257x257 image.
        if (gid.x >= 257 || gid.y >= 257) return;
        vec3 pixel = texelFetch(u_Texture0, gid, 0).xyz;
        // Interleaved RGB layout: three floats per texel, row-major.
        int linear_index = 3 * (gid.y * 257 + gid.x);
        output_data.elements[linear_index + 0] = pixel.x;
        output_data.elements[linear_index + 1] = pixel.y;
        output_data.elements[linear_index + 2] = pixel.z;
    }
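To check what the compute shader is expected to write, it can help to reproduce its indexing on the CPU. The sketch below is a hypothetical reference, not part of the original code: the class SsboLayoutReference and its method are illustrative names, and it assumes the source texture is 8-bit RGBA sampled as normalized values in [0, 1], with no vertical flip. A model that expects a different input range (for example [-1, 1]) would need an extra scaling step in both the shader and this reference.

```java
/** Hypothetical CPU reference for the shader's texel-to-buffer mapping. */
final class SsboLayoutReference {
    private SsboLayoutReference() {}

    /**
     * Converts tightly packed RGBA8 pixels into interleaved float RGB using
     * the same index as the shader: 3 * (y * width + x).
     */
    static float[] rgbaToInterleavedRgb(byte[] rgba, int width, int height) {
        if (rgba.length != width * height * 4) {
            throw new IllegalArgumentException("expected width * height * 4 bytes");
        }
        float[] out = new float[width * height * 3];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int src = 4 * (y * width + x);
                int dst = 3 * (y * width + x);
                // Mask to treat each byte as unsigned, then normalize to [0, 1].
                out[dst] = (rgba[src] & 0xFF) / 255.0f;
                out[dst + 1] = (rgba[src + 1] & 0xFF) / 255.0f;
                out[dst + 2] = (rgba[src + 2] & 0xFF) / 255.0f;
                // The alpha channel at src + 3 is dropped, as in the shader.
            }
        }
        return out;
    }
}
```

Comparing this array against values read back from the SSBO on a small test image is one way to confirm that the dispatch covers the whole image and that channel order matches.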

There is no simple way to dump a SurfaceTexture to an SSBO directly. The simplest path would be SurfaceTexture -> GlTexture -> SSBO. The TFLite GPU team is also trying to introduce another API (bindGlTextureToTensor), but until that is available, here is a shader program I used for the GlTexture -> SSBO conversion:

    #version 310 es

    layout(local_size_x = 16, local_size_y = 16) in;
    layout(binding = 0) uniform sampler2D input_texture;
    layout(std430) buffer;
    layout(binding = 1) buffer Output { float elements[]; } output_data;

    void main() {
      ivec2 gid = ivec2(gl_GlobalInvocationID.xy);
      if (gid.x >= 224 || gid.y >= 224) return;
      vec3 pixel = texelFetch(input_texture, gid, 0).xyz;
      int linear_index = 3 * (gid.y * 224 + gid.x);
      output_data.elements[linear_index + 0] = pixel.x;
      output_data.elements[linear_index + 1] = pixel.y;
      output_data.elements[linear_index + 2] = pixel.z;
    }

Note that this was for MobileNet v1, whose input tensor size is 224x224x3.
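Because the shader hard-codes 224 in both the bounds check and the row stride, adapting it to another model (such as the 257x257 input in the question) means changing the size in each place. One way to keep them in sync is to generate the source from the tensor dimensions. This is a minimal sketch under that assumption; TextureToSsboShader and its source method are hypothetical names, and the generated GLSL is otherwise the shader shown above.

```java
/** Hypothetical generator for the texture-to-SSBO compute shader source. */
final class TextureToSsboShader {
    private TextureToSsboShader() {}

    /** Returns GLSL ES 3.10 source for a width x height RGB float output. */
    static String source(int width, int height, int localSize) {
        if (width <= 0 || height <= 0 || localSize <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        return "#version 310 es\n"
                + "layout(local_size_x = " + localSize
                + ", local_size_y = " + localSize + ") in;\n"
                + "layout(binding = 0) uniform sampler2D input_texture;\n"
                + "layout(std430) buffer;\n"
                + "layout(binding = 1) buffer Output { float elements[]; } output_data;\n"
                + "void main() {\n"
                + "  ivec2 gid = ivec2(gl_GlobalInvocationID.xy);\n"
                // The bounds check and the row stride use the same width.
                + "  if (gid.x >= " + width + " || gid.y >= " + height + ") return;\n"
                + "  vec3 pixel = texelFetch(input_texture, gid, 0).xyz;\n"
                + "  int linear_index = 3 * (gid.y * " + width + " + gid.x);\n"
                + "  output_data.elements[linear_index + 0] = pixel.x;\n"
                + "  output_data.elements[linear_index + 1] = pixel.y;\n"
                + "  output_data.elements[linear_index + 2] = pixel.z;\n"
                + "}\n";
    }
}
```

The returned string can be passed to whatever shader compilation helper the app already uses, for example TextureToSsboShader.source(257, 257, 16) for the model in the question.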
