
How to set a box on a CameraX preview so as to process it using ImageAnalysis in Java?

I have been working on an app that uses CameraX for its preview stream, but it also needs a box-style overlay from which the text will be decoded. I have successfully implemented the preview, but I can't seem to find a way to implement an overlay from which the text will be decoded without using any third-party application. Right now we can decode text from the entire screen. I have seen code that does just this in a Codelabs tutorial (link), but it's in Kotlin and I can't decipher this complex Kotlin code. If anyone can help me do this without using a third-party library, it would be great. Thanks in advance.

My XML code:


My camera logic:

PreviewView mCameraView;
Camera camera;

void startCamera() {
  mCameraView = findViewById(R.id.previewView);

  cameraProviderFuture = ProcessCameraProvider.getInstance(this);

  cameraProviderFuture.addListener(() -> {
      try {
          ProcessCameraProvider cameraProvider = cameraProviderFuture.get();
          // Hand the provider over once it is available
          bindPreview(cameraProvider);
      } catch (ExecutionException | InterruptedException e) {
          // No errors need to be handled for this Future.
          // This should never be reached.
      }
  }, ContextCompat.getMainExecutor(this));
}

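void bindPreview(@NonNull ProcessCameraProvider cameraProvider) {

  // Any further builder options from the original post are missing here
  Preview preview = new Preview.Builder().build();

  CameraSelector cameraSelector = new CameraSelector.Builder()
          // Lens choice is missing from the post; the back camera is assumed
          .requireLensFacing(CameraSelector.LENS_FACING_BACK)
          .build();

  ImageAnalysis imageAnalysis = new ImageAnalysis.Builder()
          .setTargetResolution(new Size(4000, 5000))
          .build();

  imageAnalysis.setAnalyzer(executor, image -> {
      int rotationDegrees = degreesToFirebaseRotation(image.getImageInfo().getRotationDegrees());

      Image mediaImage = image.getImage();
      if (mediaImage == null) {
          image.close();
          return;
      }

      FirebaseVisionImage firebaseVisionImage = FirebaseVisionImage.fromMediaImage(mediaImage, rotationDegrees);

      // On-device recognizer assumed; the right-hand side is missing from the post
      FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance().getOnDeviceTextRecognizer();

      detector.processImage(firebaseVisionImage)
              .addOnSuccessListener(firebaseVisionText -> {
                  // Task completed successfully
                  String text = firebaseVisionText.getText();
                  if (!text.isEmpty()) {
                      if (firstValidFrame == 0)
                          firstValidFrame = frames;
                      // (the rest of this handler is missing from the post)
                  }
              })
              .addOnFailureListener(e -> {
                  Log.e("Error", e.toString());
              })
              // Release the frame so CameraX can deliver the next one
              .addOnCompleteListener(task -> image.close());
  });

  // imageAnalysis has to be bound as well, otherwise the analyzer never receives frames
  camera = cameraProvider.bindToLifecycle(this, cameraSelector, imageAnalysis, preview);
}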


private int degreesToFirebaseRotation(int degrees) {
  switch (degrees) {
      case 0:
          return FirebaseVisionImageMetadata.ROTATION_0;
      case 90:
          return FirebaseVisionImageMetadata.ROTATION_90;
      case 180:
          return FirebaseVisionImageMetadata.ROTATION_180;
      case 270:
          return FirebaseVisionImageMetadata.ROTATION_270;
      default:
          throw new IllegalArgumentException(
                  "Rotation must be 0, 90, 180, or 270.");
  }
}

I found out how to do it, and I wrote an article with a demo repo for anyone who runs into the same problem I had. Here is the link: https://medium.com/@sdptd20/exploring-ocr-capabilities-of-ml-kit-using-camera-x-9949633af0fe

  1. Basically, I got the frames from the CameraX preview using ImageAnalysis.
  2. Then I created a SurfaceView on top of the preview and drew a rectangle on it.
  3. Then I took the offset of the rectangle and cropped my bitmap according to it.
  4. Finally, I fed the cropped bitmaps to the Firebase text recognizer and got only the text displayed inside the bounding box.
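The cropping in steps 3 and 4 comes down to a little arithmetic. Here is a minimal sketch in plain Java, assuming a centered square box whose half-side is the shorter frame side (less a 5% margin) divided by a divisor, which are the numbers the activity uses. `CenterBox` and `compute` are hypothetical names for illustration, not part of CameraX or Firebase.

```java
/** Hypothetical helper: computes a centered square crop box for a frame. */
final class CenterBox {
    private CenterBox() {}

    /**
     * Returns {left, top, right, bottom} of a square centered in a
     * width x height frame. A larger divisor gives a smaller box.
     */
    static int[] compute(int width, int height, int divisor) {
        if (width <= 0 || height <= 0 || divisor < 2) {
            throw new IllegalArgumentException(
                    "width and height must be positive, divisor at least 2");
        }
        // Start from the shorter side and shave off a 5% margin.
        int diameter = Math.min(width, height);
        diameter -= (int) (0.05 * diameter);

        // Half the side length of the box; divisor >= 2 keeps it inside the frame.
        int half = diameter / divisor;

        int left = width / 2 - half;
        int top = height / 2 - half;
        int right = width / 2 + half;
        int bottom = height / 2 + half;
        return new int[] {left, top, right, bottom};
    }
}
```

Calling `CenterBox.compute(720, 1488, 3)` gives the box for a 720 x 1488 frame; `right - left` and `bottom - top` are then the width and height to crop.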

Here is the gist of the main activity:

public class MainActivity extends AppCompatActivity implements SurfaceHolder.Callback {
    TextView textView;
    PreviewView mCameraView;
    SurfaceHolder holder;
    SurfaceView surfaceView;
    Canvas canvas;
    Paint paint;
    int cameraHeight, cameraWidth, xOffset, yOffset, boxWidth, boxHeight;

    private ListenableFuture<ProcessCameraProvider> cameraProviderFuture;
    private ExecutorService executor = Executors.newSingleThreadExecutor();

    /**
     * Responsible for converting the rotation degrees from CameraX into the one compatible with Firebase ML
     */
    private int degreesToFirebaseRotation(int degrees) {
        switch (degrees) {
            case 0:
                return FirebaseVisionImageMetadata.ROTATION_0;
            case 90:
                return FirebaseVisionImageMetadata.ROTATION_90;
            case 180:
                return FirebaseVisionImageMetadata.ROTATION_180;
            case 270:
                return FirebaseVisionImageMetadata.ROTATION_270;
            default:
                throw new IllegalArgumentException(
                        "Rotation must be 0, 90, 180, or 270.");
        }
    }

    /**
     * Starting Camera
     */
    void startCamera(){
        mCameraView = findViewById(R.id.previewView);

        cameraProviderFuture = ProcessCameraProvider.getInstance(this);

        cameraProviderFuture.addListener(new Runnable() {
            @Override
            public void run() {
                try {
                    ProcessCameraProvider cameraProvider = cameraProviderFuture.get();
                    //Hand the provider over once it is available
                    MainActivity.this.bindPreview(cameraProvider);
                } catch (ExecutionException | InterruptedException e) {
                    // No errors need to be handled for this Future.
                    // This should never be reached.
                }
            }
        }, ContextCompat.getMainExecutor(this));
    }

    /**
     * Binding to camera
     */
    private void bindPreview(ProcessCameraProvider cameraProvider) {
        Preview preview = new Preview.Builder()
                .build();

        CameraSelector cameraSelector = new CameraSelector.Builder()
                //Lens choice is missing from the post; the back camera is assumed
                .requireLensFacing(CameraSelector.LENS_FACING_BACK)
                .build();

        //Assumed call (missing from the post): connects the preview to the PreviewView.
        //Older camera-view releases named this createSurfaceProvider()
        preview.setSurfaceProvider(mCameraView.getSurfaceProvider());

        //Image Analysis Function
        //Set static size according to your device or write a dynamic function for it
        ImageAnalysis imageAnalysis =
                new ImageAnalysis.Builder()
                        .setTargetResolution(new Size(720, 1488))
                        .build();

        imageAnalysis.setAnalyzer(executor, new ImageAnalysis.Analyzer() {
            @Override
            public void analyze(@NonNull ImageProxy image) {
                //Nothing to process without an underlying media image
                if (image.getImage() == null) {
                    image.close();
                    return;
                }
                //changing normal degrees into Firebase rotation
                int rotationDegrees = degreesToFirebaseRotation(image.getImageInfo().getRotationDegrees());
                //Getting a FirebaseVisionImage object using the Image object and rotationDegrees
                final Image mediaImage = image.getImage();
                FirebaseVisionImage images = FirebaseVisionImage.fromMediaImage(mediaImage, rotationDegrees);
                //Getting bitmap from FirebaseVisionImage Object
                Bitmap bmp = images.getBitmap();
                //Getting the values for cropping
                int height = bmp.getHeight();
                int width = bmp.getWidth();

                int left, right, top, bottom, diameter;

                diameter = width;
                if (height < width) {
                    diameter = height;
                }

                int offset = (int) (0.05 * diameter);
                diameter -= offset;

                left = width / 2 - diameter / 3;
                top = height / 2 - diameter / 3;
                right = width / 2 + diameter / 3;
                bottom = height / 2 + diameter / 3;

                xOffset = left;
                yOffset = top;

                //Creating new cropped bitmap. The size comes from the bitmap-space box
                //(boxWidth/boxHeight are measured on the view) so the crop stays in bounds
                Bitmap bitmap = Bitmap.createBitmap(bmp, left, top, right - left, bottom - top);
                //initializing FirebaseVisionTextRecognizer object
                //(on-device recognizer assumed; the rest of this call is missing from the post)
                FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance()
                        .getOnDeviceTextRecognizer();
                //Passing FirebaseVisionImage Object created from the cropped bitmap
                Task<FirebaseVisionText> result = detector.processImage(FirebaseVisionImage.fromBitmap(bitmap))
                        .addOnSuccessListener(new OnSuccessListener<FirebaseVisionText>() {
                            @Override
                            public void onSuccess(FirebaseVisionText firebaseVisionText) {
                                // Task completed successfully
                                // ...
                                //getting decoded text
                                String text = firebaseVisionText.getText();
                                //Setting the decoded text in the textview
                                textView.setText(text);
                                //for getting blocks and line elements
                                for (FirebaseVisionText.TextBlock block: firebaseVisionText.getTextBlocks()) {
                                    String blockText = block.getText();
                                    for (FirebaseVisionText.Line line: block.getLines()) {
                                        String lineText = line.getText();
                                        for (FirebaseVisionText.Element element: line.getElements()) {
                                            String elementText = element.getText();
                                        }
                                    }
                                }
                            }
                        })
                        .addOnFailureListener(new OnFailureListener() {
                            @Override
                            public void onFailure(@NonNull Exception e) {
                                // Task failed with an exception
                                // ...
                            }
                        });
                //Release the frame so CameraX can deliver the next one
                image.close();
            }
        });

        Camera camera = cameraProvider.bindToLifecycle((LifecycleOwner)this, cameraSelector, imageAnalysis, preview);
    }

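    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        //Layout name and text view id are assumed; the XML is missing from the post
        setContentView(R.layout.activity_main);
        textView = findViewById(R.id.text);

        //Start Camera
        startCamera();

        //Create the bounding box
        surfaceView = findViewById(R.id.overlay);
        //Keep the overlay above the preview and transparent (assumed setup)
        surfaceView.setZOrderOnTop(true);
        holder = surfaceView.getHolder();
        holder.setFormat(PixelFormat.TRANSPARENT);
        holder.addCallback(this);
    }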

    /**
     * For drawing the rectangular box
     */
    private void DrawFocusRect(int color) {
        int height = mCameraView.getHeight();
        int width = mCameraView.getWidth();

        //cameraHeight = height;
        //cameraWidth = width;

        int left, right, top, bottom, diameter;

        diameter = width;
        if (height < width) {
            diameter = height;
        }

        int offset = (int) (0.05 * diameter);
        diameter -= offset;

        canvas = holder.lockCanvas();
        canvas.drawColor(0, PorterDuff.Mode.CLEAR);
        //border's properties
        paint = new Paint();
        paint.setStyle(Paint.Style.STROKE);
        paint.setColor(color);
        //Stroke width is assumed; the original value is missing from the post
        paint.setStrokeWidth(5);

        left = width / 2 - diameter / 3;
        top = height / 2 - diameter / 3;
        right = width / 2 + diameter / 3;
        bottom = height / 2 + diameter / 3;

        xOffset = left;
        yOffset = top;
        boxHeight = bottom - top;
        boxWidth = right - left;
        //Changing the value of x in diameter/x changes the size of the box; the size is inversely proportional to x
        canvas.drawRect(left, top, right, bottom, paint);
        //Publish the drawing to the surface
        holder.unlockCanvasAndPost(canvas);
    }

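    /**
     * Callback functions for the surface Holder
     */
    @Override
    public void surfaceCreated(SurfaceHolder holder) {
    }

    @Override
    public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
        //Drawing rectangle (the color is assumed; the original argument is missing from the post)
        DrawFocusRect(Color.GREEN);
    }

    @Override
    public void surfaceDestroyed(SurfaceHolder holder) {
    }
}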

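One thing to watch in the activity above: the rectangle is drawn in PreviewView coordinates, while the crop happens in bitmap coordinates, and the two sizes usually differ. Here is a rough sketch of scaling the box from one space to the other in plain Java. It assumes the view shows the whole frame, with no cropping or letterboxing. PreviewView's default FILL_CENTER scale type does crop, so treat this as a starting point rather than a complete mapping. `BoxMapper` is a hypothetical helper name.

```java
/** Hypothetical helper: scales a box from view coordinates to image coordinates. */
final class BoxMapper {
    private BoxMapper() {}

    /**
     * viewBox is {left, top, right, bottom} in view pixels.
     * Returns the same box expressed in image pixels, assuming the
     * view displays the full image stretched to viewW x viewH.
     */
    static int[] viewToImage(int[] viewBox, int viewW, int viewH, int imgW, int imgH) {
        if (viewBox.length != 4 || viewW <= 0 || viewH <= 0) {
            throw new IllegalArgumentException("need 4 box values and a positive view size");
        }
        // Independent scale factors for each axis.
        double sx = (double) imgW / viewW;
        double sy = (double) imgH / viewH;

        int left = (int) Math.round(viewBox[0] * sx);
        int top = (int) Math.round(viewBox[1] * sy);
        int right = (int) Math.round(viewBox[2] * sx);
        int bottom = (int) Math.round(viewBox[3] * sy);
        return new int[] {left, top, right, bottom};
    }
}
```

The resulting `left`, `top`, `right - left` and `bottom - top` are the values to pass to `Bitmap.createBitmap(bmp, left, top, width, height)`.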

Edit: I have found that you can also use a PNG file in an ImageView instead of the SurfaceView. That may be cleaner, and it lets you integrate a customised layout for users to superimpose on the preview.

Edit 2: I have found that sending a bitmap to the image analyzer can be inefficient (I was working on the ML Kit barcode reader, and it explicitly logs a warning about this), so what we can do instead is:


where imagePreview is the ImageProxy image and r is an android.graphics.Rect.
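The snippet for this edit is missing from this copy of the post. Going by the description, it presumably sets a crop rectangle on the frame, for which CameraX offers `ImageProxy.setCropRect(Rect)`; that reading is an assumption, and whether a given analyzer honors the crop rectangle is worth checking in its documentation. Either way, the rectangle has to lie inside the frame. Below is a small plain-Java sketch of clamping it first. `CropClamp` is a hypothetical name, and the four returned values map onto the `Rect(left, top, right, bottom)` constructor.

```java
/** Hypothetical helper: clamps a crop rectangle to the bounds of a frame. */
final class CropClamp {
    private CropClamp() {}

    /**
     * Returns {left, top, right, bottom} clamped to a frame of
     * imgW x imgH pixels, keeping the rectangle at least 1 pixel
     * wide and 1 pixel tall.
     */
    static int[] clamp(int left, int top, int right, int bottom, int imgW, int imgH) {
        if (imgW <= 0 || imgH <= 0) {
            throw new IllegalArgumentException("frame size must be positive");
        }
        // Top-left corner must fall on a valid pixel.
        int l = Math.max(0, Math.min(left, imgW - 1));
        int t = Math.max(0, Math.min(top, imgH - 1));
        // Bottom-right corner is exclusive and must stay past the top-left corner.
        int r = Math.max(l + 1, Math.min(right, imgW));
        int b = Math.max(t + 1, Math.min(bottom, imgH));
        return new int[] {l, t, r, b};
    }
}
```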

