
Java persistent key-value store

I know this may have been asked a zillion times before, but I cannot seem to find the golden solution for my exact use case.

I only have one data structure: a map whose keys are strings. The values of that map are themselves maps, but in these inner maps the values are simple objects/primitives such as string, int, double, etc. So it is a map of maps. The keys of an inner map are constant, i.e. no entries are ever added to or removed from an inner map after it is created. So it is rather like a traditional table, although each row may have arbitrary columns.
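
For illustration, the structure described above could be sketched in plain Java roughly as follows. The class name `RowTable` and its methods are hypothetical, chosen only for this example. An update is rejected when the row or column does not already exist, which mirrors the constraint that the keys of an inner map are fixed at creation time.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: an in-memory "table" in which every row is a map
// whose set of columns is fixed when the row is created.
class RowTable {
    private final Map<String, Map<String, Object>> rows = new ConcurrentHashMap<>();

    // INSERT: copy the given columns; this fixes the row's column set.
    // Values must be non-null because ConcurrentHashMap rejects nulls.
    void insert(String key, Map<String, Object> columns) {
        rows.put(key, new ConcurrentHashMap<>(columns));
    }

    // UPDATE: only an existing column of an existing row may change.
    void update(String key, String column, Object value) {
        Map<String, Object> row = rows.get(key);
        if (row == null) {
            throw new IllegalArgumentException("no such row: " + key);
        }
        // replace() returns null when the column is absent and adds nothing.
        if (row.replace(column, value) == null) {
            throw new IllegalArgumentException("no such column: " + column);
        }
    }

    // Returns the stored value, or null if the row or column is missing.
    Object get(String key, String column) {
        Map<String, Object> row = rows.get(key);
        return row == null ? null : row.get(column);
    }

    // DELETE: drop the whole row.
    void delete(String key) {
        rows.remove(key);
    }
}
```

The hot path in this sketch is `update`, which touches only memory; persistence and replication would have to be layered on top of it.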

I need this data structure to be persistent and replicated.

Here are my requirements:

  1. Pure Java solution.
  2. The disk map is only used in case of a restart. Hence there are never any reads from disk during normal operation, and all the writing is done by a single application.
  3. Embedded.
  4. Performance. It is the UPDATE performance of existing records that is important. UPDATEs will happen potentially 100k times per second (but more likely 20-50k per second). INSERTs/DELETEs do of course happen, but probably only a few times per day. Hence I do not worry too much about INSERT/DELETE performance.
  5. Replicated. For resilience I need the disk copy of the map to be replicated. The replication from master to slave does not need to be part of the original transaction, i.e. I can sacrifice some ACIDness for performance.
  6. The number of records is expected to be 100k-200k, but not much higher. The size of each record is probably 100-200 KBytes, so really not that much data in total. I'm guessing the total size of the data file will be below 100 MBytes, and that is probably an estimate on the high side.
  7. The total amount of data will always fit in memory (this is why I can guarantee that there will be no disk reads, except during startup).
  8. My application is not distributed. At any given point in time there is only one active process that writes to disk.
  9. Liberal open source license (Apache, BSD, LGPL should be fine).
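
Requirements 2, 4 and 7 together suggest one common pattern: keep the live data in memory, append every change to a log file, and read that file only once at startup. The following is a minimal sketch of that idea using only the JDK. The class `UpdateLog`, its record layout and the string-only values are assumptions made for the example, not a recommendation from this thread, and it does not cover replication, log compaction or an fsync policy.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.Closeable;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: an append-only log of (row key, column, value) records.
class UpdateLog implements Closeable {
    private final DataOutputStream out;

    UpdateLog(File file) throws IOException {
        // Append mode, so records written before a restart are kept.
        out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file, true)));
    }

    // Append one record. writeUTF limits each string to 65535 encoded bytes.
    void append(String key, String column, String value) throws IOException {
        out.writeUTF(key);
        out.writeUTF(column);
        out.writeUTF(value);
    }

    // Push buffered records to the operating system (this is not an fsync).
    void flush() throws IOException {
        out.flush();
    }

    @Override
    public void close() throws IOException {
        out.close();
    }

    // Startup only: rebuild the in-memory map by replaying the log in order,
    // so the last record written for a column wins.
    static Map<String, Map<String, String>> replay(File file) throws IOException {
        Map<String, Map<String, String>> rows = new HashMap<>();
        if (!file.exists()) {
            return rows;
        }
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            while (true) {
                try {
                    String key = in.readUTF();
                    String column = in.readUTF();
                    String value = in.readUTF();
                    rows.computeIfAbsent(key, k -> new HashMap<>()).put(column, value);
                } catch (EOFException end) {
                    break; // end of log; a truncated final record is dropped
                }
            }
        }
        return rows;
    }
}
```

In this sketch the writer never reads the file back, so the update path costs one in-memory change plus one buffered append. Because the log only grows, a real implementation would also need to rewrite it periodically as a snapshot.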

The application in question never needs to store anything but the above data structure, i.e. it will not have an unforeseen future need for other persistent data structures. Hence it sounds fair to optimize based on this particular data structure.

I've looked at Berkeley DB Java Edition, but it fails on requirement #6. I've looked at TokyoCabinet/KyotoCabinet, but they fail on requirement #1.

So what would you recommend?

There are several options, but neo4j seems to match what you want. HBase and Cassandra are also options, but more than you probably need.

Have you looked at Redis? It's an in-memory "database" (key-value store) that, IMHO, meets all your needs.

Have a look at HazelCast. It meets most of your requirements, except that it's distributed.

I think Chronicle Map is a good match for your case.
