
Java Fast Data Storage & Retrieval

I need to store records into persistent storage and retrieve them on demand. The requirements are as follows:

  1. Extremely fast retrieval and insertion
  2. Each record will have a unique key. This key will be used to retrieve the record.
  3. The stored data should be persistent, i.e. available after a JVM restart.
  4. A separate process would move stale records to an RDBMS once a day.

What do you guys think? I cannot use a standard database because of latency issues. In-memory databases like HSQLDB/H2 have performance constraints. Moreover, the records are simple string objects and do not really call for SQL. I am thinking of some kind of flat-file-based solution. Any ideas? Any open source project? I am sure there must be someone who has solved this problem before.

There are lots of diverse tools and methods, but I think none of them can shine in all of these requirements.

For low latency, you can only rely on in-memory data access - disks are physically too slow (and so are SSDs). If the data does not fit in the memory of a single machine, we have to distribute it across enough nodes to hold it all in memory.

For persistence, we have to write our data to disk after all. With an optimal organization this can be done as a background activity that does not affect latency. However, for reliability (failover, HA or whatever), disk operations cannot be totally independent of the access methods: when modifying data we have to wait for the disks to make sure our operation will not disappear. Concurrency also adds some complexity and latency.

The data model is not a restriction here: most of these methods support access based on a unique key.

We have to decide:

  • whether the data fits in the memory of one machine, or we have to find a distributed solution,
  • whether concurrency is an issue, or there are no parallel operations,
  • whether reliability is strict and we cannot lose modifications, or we can live with the fact that an unplanned crash would result in data loss.

Solutions might be:

  • Self-implemented data structures using the standard Java library, files etc. may not be the best solution, because reliability and low latency require clever implementations and lots of testing.
  • Traditional RDBMSs have a flexible data model, durable, atomic and isolated operations, caching etc. - they actually know too much, and are mostly hard to distribute. That's why they are too slow if you cannot turn off the unwanted features, which is usually the case.
  • NoSQL and key-value stores are good alternatives. These terms are quite vague and cover lots of tools. Examples are:
    • BerkeleyDB or Kyoto Cabinet as one-machine persistent key-value stores (using B-trees): can be used if the data set is small enough to fit in the memory of one machine.
    • Project Voldemort as a distributed key-value store: uses BerkeleyDB Java Edition inside, simple and distributed.
    • ScalienDB as a distributed key-value store: reliable, but not too slow for writes either.
    • MemcacheDB, Redis and other caching databases with persistence.
    • Popular NoSQL systems like Cassandra, CouchDB, HBase etc.: used mainly for big data.

A list of NoSQL tools can be found, e.g., here.

Voldemort's performance tests report sub-millisecond response times, and these can be achieved quite easily; however, we have to be careful with the hardware too (like the network properties mentioned above).

If all the data fits in memory, MySQL can run in memory instead of from disk (MySQL Cluster, Hybrid Storage). It can then handle storing itself to disk for you.

What about something like CouchDB?

I would use a BlockingQueue for that. Simple, and built into Java.
I do something similar using realtime data from the Chicago Mercantile Exchange.
The data is sent to one place for realtime use... and to another place (via TCP), using a BlockingQueue (producer/consumer) to persist the data to a database (Oracle, H2).
The consumer uses a time-delayed commit to avoid disk sync issues in the database.
(H2-type databases use asynchronous commit by default and avoid that issue.) I log the persisting in the consumer to keep track of the queue size, to be sure it is able to keep up with the producer. Works pretty well for me.
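A minimal sketch of that producer/consumer setup, assuming records are simple key/value strings; the queue capacity, batch size and the persistBatch() helper are placeholders, not taken from the answer above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueuedPersister {
    // Bounded queue, so a slow consumer applies back-pressure to the producer.
    private final BlockingQueue<String[]> queue = new ArrayBlockingQueue<>(10_000);

    // Producer side: called from the realtime feed; blocks if the queue is full.
    public void submit(String key, String value) throws InterruptedException {
        queue.put(new String[] { key, value });
    }

    // Consumer side: drains the queue and commits in batches (the
    // "time-delayed commit" from the answer) instead of once per record.
    public void runConsumer() throws InterruptedException {
        List<String[]> batch = new ArrayList<>();
        while (!Thread.currentThread().isInterrupted()) {
            batch.add(queue.take());       // wait for at least one record
            queue.drainTo(batch, 999);     // grab whatever else is ready
            persistBatch(batch);           // one commit for the whole batch
            System.out.println("queue size after batch: " + queue.size());
            batch.clear();
        }
    }

    private void persistBatch(List<String[]> batch) {
        // Hypothetical: a JDBC batch insert plus commit would go here (Oracle, H2, ...).
    }
}
```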

MySQL with shards may be a good idea. However, it depends on the data volume, transactions per second and latency you need.

In-memory databases are also a good idea. In fact, MySQL provides memory-based tables as well.

Would a tuple space / JavaSpace work? Also check out other enterprise data fabrics like Oracle Coherence and Gemstone.

Have you actually proved that using an out-of-process SQL database like MySQL or SQL Server is too slow, or is this an assumption?

You could use a SQL database in conjunction with an in-memory cache to ensure that retrievals do not hit the database at all. Even though the records are plain text, I would still advise using SQL over a flat-file solution (e.g. using a text column in your table schema), as the RDBMS will perform optimisations that a file system cannot (e.g. caching recently accessed pages).

However, without more information about your access patterns, expected throughput, etc., I can't provide much more in the way of suggestions.
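A sketch of that cache-aside idea; the loadFromDatabase/storeToDatabase helpers are hypothetical names for the JDBC calls you would write yourself:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class CachedStore {
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

    // Read path: serve from memory; fall back to the database on a miss.
    public String get(String key) {
        return cache.computeIfAbsent(key, this::loadFromDatabase);
    }

    // Write path: update the cache immediately, write through to the database.
    public void put(String key, String value) {
        cache.put(key, value);
        storeToDatabase(key, value);   // could also be queued for async writes
    }

    private String loadFromDatabase(String key) { /* JDBC SELECT goes here */ return null; }
    private void storeToDatabase(String key, String value) { /* JDBC INSERT/UPDATE goes here */ }
}
```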

How much does it matter if you lose a record or two? Where are they coming from? Do you have a transactional relationship with the source?

If you have serious reliability requirements then I think you may need to be prepared to pay some DB overhead.

Perhaps you could separate the persistence problem from the in-memory problem. Use a pub-sub approach: one subscriber looks after the in-memory copy, the other persists the data ready for subsequent startup.

Distributed caching products such as WebSphere eXtreme Scale (no Java EE dependency) might be relevant if you can buy rather than build.

How bad would it be if you lose a couple of entries in case of a crash?

If it isn't that bad, the following approach might work for you:

Create a flat file for each entry, with the file name equal to the id. Possibly one file for a not-so-big number of consecutive entries.

Make sure your controller has a good cache and/or use one of the existing caches implemented in Java.

Talk to a file system expert about how to make this really fast.

It is simple and it might be fast. Of course you lose transactions, including the ACID properties.
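A minimal sketch of the file-per-key idea; the directory name is arbitrary, the key is assumed to be a valid file name, and there is no crash safety, so a crash mid-write can lose or corrupt the entry being written:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FilePerKeyStore {
    private final Path dir;

    public FilePerKeyStore(String dirName) throws IOException {
        this.dir = Files.createDirectories(Paths.get(dirName));
    }

    // Write the record to a file named after its key.
    public void put(String key, String value) throws IOException {
        Files.write(dir.resolve(key), value.getBytes(StandardCharsets.UTF_8));
    }

    // Read it back; returns null if the key was never stored.
    public String get(String key) throws IOException {
        Path file = dir.resolve(key);
        return Files.exists(file)
                ? new String(Files.readAllBytes(file), StandardCharsets.UTF_8)
                : null;
    }
}
```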

If you are looking for a simple key-value store and don't need complex SQL querying, Berkeley DB might be worth a look.

Another alternative is Tokyo Cabinet, a modern DBM implementation.
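A minimal sketch with Berkeley DB Java Edition, assuming the `je` jar is on the classpath; the environment directory, key and value are placeholders:

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import com.sleepycat.je.*;

public class BdbExample {
    public static void main(String[] args) {
        // JE requires the environment home directory to exist.
        File home = new File("bdb-env");
        home.mkdirs();

        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);
        Environment env = new Environment(home, envConfig);

        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setAllowCreate(true);
        Database db = env.openDatabase(null, "records", dbConfig);

        // Keys and values are raw byte arrays wrapped in DatabaseEntry.
        DatabaseEntry key = new DatabaseEntry("user:42".getBytes(StandardCharsets.UTF_8));
        DatabaseEntry value = new DatabaseEntry("payload".getBytes(StandardCharsets.UTF_8));
        db.put(null, key, value);

        DatabaseEntry found = new DatabaseEntry();
        if (db.get(null, key, found, LockMode.DEFAULT) == OperationStatus.SUCCESS) {
            System.out.println(new String(found.getData(), StandardCharsets.UTF_8));
        }

        db.close();
        env.close();
    }
}
```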

Sub-millisecond r/w means you cannot depend on disk, and you have to be careful about network latency. Just forget about standard SQL-based solutions, main-memory or not. In one millisecond you cannot move much more than 100 KB over a Gbit network (1 Gbit/s × 1 ms ÷ 8 ≈ 125 KB). Ask a telecom engineer; they are used to solving these kinds of problems.

MapDB provides highly performant HashMaps/TreeMaps that are persisted to disk. It's a single library that you can embed in your Java program.
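A minimal sketch with MapDB 3.x; the file name and map name are arbitrary, and you should check the builder API against the MapDB version you actually use:

```java
import java.util.concurrent.ConcurrentMap;
import org.mapdb.DB;
import org.mapdb.DBMaker;
import org.mapdb.Serializer;

public class MapDbExample {
    public static void main(String[] args) {
        // File-backed DB; transactionEnable() adds a write-ahead log for crash protection.
        DB db = DBMaker.fileDB("records.db").transactionEnable().make();

        ConcurrentMap<String, String> map = db
                .hashMap("records", Serializer.STRING, Serializer.STRING)
                .createOrOpen();

        map.put("user:42", "payload");
        db.commit();                      // flush changes to disk

        System.out.println(map.get("user:42"));
        db.close();
    }
}
```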

Chronicle Map is a ConcurrentMap implementation which stores keys and values off-heap, in a memory-mapped file. So you have persistence across JVM restarts.

ChronicleMap.get() is consistently faster than 1 µs, sometimes as fast as 100 ns per operation. It's the fastest solution in its class.
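A sketch with Chronicle Map; the entry count and the averageKey/averageValue sizing samples are assumptions you must tune for your data (they are required for variable-size types like String):

```java
import java.io.File;
import java.io.IOException;
import net.openhft.chronicle.map.ChronicleMap;

public class ChronicleExample {
    public static void main(String[] args) throws IOException {
        // Persisted to a memory-mapped file, so entries survive JVM restarts.
        ChronicleMap<String, String> map = ChronicleMap
                .of(String.class, String.class)
                .name("records")
                .entries(1_000_000)           // expected max number of entries
                .averageKey("user:123456")    // size hint for keys
                .averageValue("some payload") // size hint for values
                .createPersistedTo(new File("records.dat"));

        map.put("user:42", "payload");
        System.out.println(map.get("user:42"));
        map.close();
    }
}
```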

Will all the records and keys you need fit in memory at once? If so, you could just use a HashMap<String, String>, since it's Serializable.
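A sketch of that approach, persisting the map with plain Java serialization; the file name is arbitrary, and note that every save rewrites the whole map, so this only suits small data sets:

```java
import java.io.*;
import java.util.HashMap;

public class SerializedMapStore {
    // Load the map from disk at startup; empty map if no file exists yet.
    @SuppressWarnings("unchecked")
    public static HashMap<String, String> load(File file)
            throws IOException, ClassNotFoundException {
        if (!file.exists()) return new HashMap<>();
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (HashMap<String, String>) in.readObject();
        }
    }

    // Write the whole map back to disk in one pass.
    public static void save(HashMap<String, String> map, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(map);
        }
    }
}
```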
