Image comparison performance in Java

Question

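I have the code below, but it is not efficient at all: it is very, very slow, and the more pictures I have to compare, the longer it takes.

For example, I have 500 pictures and processing each one takes 2 minutes, so 500 x 2 minutes = 1000 minutes!

The particularity is that as soon as a picture is identical to the one being compared, it is moved to another folder. The remaining files are then retrieved again for the next comparison (i++).

Any ideas?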

public static void main(String[] args) throws IOException {

    String PicturesFolderPath=null;
    String removedFolderPath=null;
    String pictureExtension=null;
    // Read each argument only if it was actually supplied
    if(args.length>0) {
         PicturesFolderPath=args[0];
    }
    if(args.length>1) {
         removedFolderPath=args[1];
    }
    if(args.length>2) {
         pictureExtension=args[2];
    }


    if(StringUtils.isBlank(pictureExtension)) {
        pictureExtension="jpg";
    }

    if(StringUtils.isBlank(removedFolderPath)) {
        removedFolderPath=Paths.get(".").toAbsolutePath().normalize().toString()+"/removed";
    }

    if(StringUtils.isBlank(PicturesFolderPath)) {
        PicturesFolderPath=Paths.get(".").toAbsolutePath().normalize().toString();
    }

    System.out.println("path to find pictures folder "+PicturesFolderPath);
    System.out.println("path to find removed pictures folder "+removedFolderPath);

    Collection<File> fileList = FileUtils.listFiles(new File(PicturesFolderPath), new String[] { pictureExtension }, false);

    System.out.println("there are "+fileList.size()+" files found with extension "+pictureExtension);

    Iterator<File> fileIterator=fileList.iterator();
    //Iterator<File> loopFileIterator=fileList.iterator();

    File dest=new File(removedFolderPath);

    while(fileIterator.hasNext()) {
        File file=fileIterator.next();

        System.out.println("process image :"+file.getName());

        //each new iteration we retrieve the files staying
        Collection<File> list = FileUtils.listFiles(new File(PicturesFolderPath), new String[] { pictureExtension }, false);
        for(File f:list) {
            if(compareImage(file,f) && !file.getName().equals(f.getName()) ) {
                String filename=file.getName();
                System.out.println("file :"+file.getName() +" equal to "+f.getName()+" and will be moved to the removed folder");
                File existFile=new File(removedFolderPath+"/"+file.getName());
                    if(existFile.exists()) {
                        existFile.delete();
                    }
                    FileUtils.moveFileToDirectory(file, dest, false);
                    fileIterator.remove();
                    System.out.println("file :"+filename+" removed");
                    break;

                }           
        }

    }

}


// Compares two image files.
// Returns true if both image files are equal, otherwise false.
public static boolean compareImage(File fileA, File fileB) {        
    try {
        // take buffer data from both image files
        BufferedImage biA = ImageIO.read(fileA);
        DataBuffer dbA = biA.getData().getDataBuffer();
        int sizeA = dbA.getSize();                      
        BufferedImage biB = ImageIO.read(fileB);
        DataBuffer dbB = biB.getData().getDataBuffer();
        int sizeB = dbB.getSize();
        // compare data-buffer objects //
        if(sizeA == sizeB) {
            for(int i=0; i<sizeA; i++) { 
                if(dbA.getElem(i) != dbB.getElem(i)) {
                    return false;
                }
            }
            return true;
        }
        else {
            return false;
        }
    } 
    catch (Exception e) { 
        e.printStackTrace();
        return  false;
    }
}

Answer 1

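The answer already mentioned should help you somewhat, since taking the width and height of the pictures into account should quickly rule out many candidate pairs.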
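A width and height check of that kind could look roughly like the sketch below. The class and method names are made up for illustration, and it assumes both images have already been decoded into BufferedImage objects. Note that comparing only the buffer sizes is not enough: a 2x3 and a 3x2 image have the same number of elements.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;

// Hypothetical helper, not part of the question's code.
final class QuickImageCompare {

    /** Returns true only if both images have the same dimensions and identical raster data. */
    static boolean sameImage(BufferedImage a, BufferedImage b) {
        // Cheap check first: pictures with different dimensions can never be equal.
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            return false;
        }
        DataBuffer dbA = a.getData().getDataBuffer();
        DataBuffer dbB = b.getData().getDataBuffer();
        if (dbA.getSize() != dbB.getSize()) {
            return false;
        }
        // Only now pay for the element-by-element comparison.
        for (int i = 0; i < dbA.getSize(); i++) {
            if (dbA.getElem(i) != dbB.getElem(i)) {
                return false;
            }
        }
        return true;
    }
}
```

This only makes each individual comparison cheaper; it does not reduce how many comparisons are made.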

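However, you still have a big problem: for every new file, you read all the old files again. The number of comparisons grows quadratically, and with an ImageIO.read at every step it is bound to be slow.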
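To put a rough number on it, here is a small sketch (hypothetical class and method names). It assumes the worst case for the loop in the question: no duplicates are found, so no file ever leaves the folder, every file is compared against every file in the listing (including itself, because compareImage runs before the name check), and each compareImage call decodes two files.

```java
// Hypothetical helper that only does the arithmetic; it reads no files.
final class ComparisonCount {

    /** Worst-case ImageIO.read calls for the nested loop: n files x n files x 2 reads per call. */
    static long nestedLoopDecodes(long n) {
        return 2 * n * n;
    }

    /** With one fingerprint per file, each file is decoded exactly once. */
    static long fingerprintDecodes(long n) {
        return n;
    }
}
```

For 500 files that is 500,000 decodes in the nested loop versus 500 with one fingerprint per file.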

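You need some kind of fingerprint that can be compared very quickly. You can't fingerprint the whole file content (because it is affected by the metadata), but you can fingerprint the image data alone.

Just iterate over the image data of a file (as you already do) and compute, for example, an MD5 hash of it. Store it, for example, as a String in a HashSet, and you get a very fast lookup.

Some untested code

For every image file you want to compare, you compute (using Guava's hashing)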

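import com.google.common.hash.HashCode;
import com.google.common.hash.Hasher;
import com.google.common.hash.Hashing;
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hashes only the decoded pixel data, so file metadata does not affect the result
static HashCode imageFingerprint(File file) throws IOException {
    Hasher hasher = Hashing.md5().newHasher();
    BufferedImage image = ImageIO.read(file);
    DataBuffer buffer = image.getData().getDataBuffer();
    int size = buffer.getSize();
    for(int i=0; i<size; i++) {
        hasher.putInt(buffer.getElem(i));
    }
    return hasher.hash();
}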

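The calculation works on the image data only, just as compareImage in the question does, so the metadata is ignored.

Instead of searching the directory for a duplicate, you compute the fingerprints of all its files and store them in a HashSet<HashCode>. For a new file, you compute its fingerprint and look it up in the set.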
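Putting the pieces together, the whole pass might look something like the sketch below. It is untested against real photo collections and makes a few assumptions: it uses the JDK's MessageDigest instead of Guava's Hasher so that it has no external dependency, the class and method names are invented, and it keeps the first file seen with a given fingerprint and moves the later ones (the loop in the question moves the file currently being processed instead).

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.imageio.ImageIO;

// Hypothetical class; a sketch of the fingerprint-and-lookup idea, not a drop-in replacement.
final class DuplicateImageMover {

    /** MD5 over the decoded pixel data only, returned as a Base64 string usable as a set key. */
    static String fingerprint(BufferedImage image) {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5, so this should not happen.
            throw new IllegalStateException(e);
        }
        DataBuffer buffer = image.getData().getDataBuffer();
        ByteBuffer bytes = ByteBuffer.allocate(4);
        for (int i = 0; i < buffer.getSize(); i++) {
            bytes.clear();
            bytes.putInt(buffer.getElem(i));
            md5.update(bytes.array());
        }
        return Base64.getEncoder().encodeToString(md5.digest());
    }

    /** Decodes each file once and moves files whose fingerprint was already seen into removedDir. */
    static List<File> moveDuplicates(List<File> files, File removedDir) throws IOException {
        Set<String> seen = new HashSet<>();
        List<File> moved = new ArrayList<>();
        Files.createDirectories(removedDir.toPath());
        for (File file : files) {
            BufferedImage image = ImageIO.read(file);
            if (image == null) {
                continue; // ImageIO found no reader for this file; skip it
            }
            // Set.add returns false when the fingerprint is already present.
            if (!seen.add(fingerprint(image))) {
                Files.move(file.toPath(), removedDir.toPath().resolve(file.getName()),
                        StandardCopyOption.REPLACE_EXISTING);
                moved.add(file);
            }
        }
        return moved;
    }
}
```

You would call moveDuplicates once with the list of files from the pictures folder and the removed folder. Each file is then read a single time, and the duplicate check is a hash lookup rather than a scan over the directory.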

Answer 1
0 2018-07-25 00:58:20