
Single Instance File Storage with Java

I was storing some files keyed by their checksum, but I found a flaw: two checksums can sometimes be identical.

I always look for an existing API before reinventing the wheel, but I can't find anything.

I know there's the JCR standard (JSR 170 / JSR 283) and Jackrabbit for content storage, but my app is light-years away from needing something like that.

So, are there established approaches to single-instance file storage in Java, or should I just keep searching for a better checksum algorithm?

EDIT:

The case where the checksum does not work: two files are exactly the same, just in different file system locations. When they are sent from the client, the server has no way of knowing the paths they came from, so it sees the same file twice, with the same checksum.

If you want to retrieve either one, how do you tell them apart?

I wanted to know if there is a standard approach, API, or algorithm that could help me spot the difference.

No matter how strong a hashing algorithm is, there is always a chance of a collision. A hashing algorithm maps an unbounded number of possible inputs onto a finite number of hash values.
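To make the collision point concrete, here is a small sketch using Java's built-in 32-bit `String.hashCode()`. It is only an illustration: the class and method names are made up for this example, and a cryptographic checksum has a far larger output space, but the same counting argument applies to it.

```java
public class HashCollisionDemo {

    // True when two different strings share the same 32-bit hash code.
    static boolean collides(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());      // 2112
        System.out.println("BB".hashCode());      // 2112
        System.out.println(collides("Aa", "BB")); // true
    }
}
```

Two different inputs, one hash value. With a longer hash the odds drop sharply, but they never reach zero.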

The only way to be certain that two files are identical is to compare them bit by bit. Hashing them is easier and faster, but carries the risk of collision.
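One way to combine the two ideas is to use the hash as a cheap first filter and fall back to a full comparison only when the hashes match. The following is a minimal sketch, not a standard API: `DuplicateCheck`, `sha256`, `contentEquals`, and `isDuplicate` are hypothetical names, and SHA-256 is an assumed choice of checksum.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DuplicateCheck {

    // Streams the file through SHA-256 so large files are never fully loaded into memory.
    static byte[] sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        return digest.digest();
    }

    // Byte-by-byte comparison: the check that a hash collision cannot fool.
    static boolean contentEquals(Path a, Path b) throws IOException {
        if (Files.size(a) != Files.size(b)) {
            return false;
        }
        try (InputStream inA = new BufferedInputStream(Files.newInputStream(a));
             InputStream inB = new BufferedInputStream(Files.newInputStream(b))) {
            int byteA;
            while ((byteA = inA.read()) != -1) {
                if (byteA != inB.read()) {
                    return false;
                }
            }
            return inB.read() == -1;
        }
    }

    // Hash first (cheap filter), then confirm with a full comparison.
    static boolean isDuplicate(Path incoming, Path stored)
            throws IOException, NoSuchAlgorithmException {
        if (!MessageDigest.isEqual(sha256(incoming), sha256(stored))) {
            return false; // different hashes prove the contents differ
        }
        return contentEquals(incoming, stored);
    }
}
```

In a real store the hash of each stored file would normally be computed once and kept alongside it, so only the incoming file needs hashing; this sketch recomputes both to stay self-contained.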

