简体繁体 English

为什么MongoDB比SQL DB快得多的任何详细和特定原因？

[英]Any detailed and specific reasons for Why MongoDB is much faster than SQL DBs?

原文 2012-06-28 12:21:59 3 3 sql/ mongodb/ performance/ nosql

Ok, there are questions about Why Is MongoDB So Fast 好的，关于MongoDB为什么这么快有疑问

I appreciate those answers, however, they are quite general. 我很欣赏这些答案，但是它们很笼统。 Yes, I know: 是的我知道：

MongoDB is document-based, then why being document-based can lead to much higher speed? MongoDB是基于文档的，那么为什么基于文档可以提高速度呢？
MongoDB is noSQL, but why noSQL means higher performance? MongoDB是noSQL，但是为什么noSQL意味着更高的性能？
SQL does a lot more than MongoDB for consistency, ACID, etc, but I believe MongoDB is also doing something similar to keep data safe, maintain indexing, etc, right? SQL在一致性，ACID等方面比MongoDB要做的事情要多得多，但是我相信MongoDB还在做类似的事情来保持数据安全，维护索引等，对吗？

Ok, I write this question just in order to find out 好吧，我写这个问题只是为了找出答案

what are the detailed and specific reasons for MongoDB's high performance? MongoDB高性能的具体原因是什么？
What exactly SQL does, but MongoDB does not do, so it gains very high performance? SQL究竟是做什么的，而MongoDB却没有，所以它获得了很高的性能？
If an interviewer (a MongoDB and SQL expert) asks you "Why MongoDB is so fast" , how would you answer? 如果访问者（MongoDB和SQL专家）问您"Why MongoDB is so fast" ，您将如何回答？ Obviously just answering: "because MongoDB is noSQL" is not enough. 显然仅回答： "because MongoDB is noSQL"是不够的。

Thanks 谢谢

3 个解决方案

First, let's compare apples with apples: Reads and writes with MongoDB are like single reads and writes by primary key on a table with no non-clustered indexes in an RDBMS. 首先，让我们将苹果与苹果进行比较： MongoDB的读写就像是在RDBMS中没有主键的表上通过主键进行的单次读写。

So lets benchmark exactly that: http://mysqlha.blogspot.de/2010/09/mysql-versus-mongodb-yet-another-silly.html 因此，让我们精确地进行基准测试： http : //mysqlha.blogspot.de/2010/09/mysql-versus-mongodb-yet-another-silly.html

And it turns out, the speed difference in a fair comparison of exactly the same primitive operation is not big. 事实证明，在完全相同的原始操作的公平比较中，速度差异并不大。 In fact, MySQL is slightly faster. 实际上，MySQL稍快一些。 I'd say, they are equivalent. 我会说，它们是等效的。

Why? 为什么？ Because actually, both systems are doing similar things in this particular benchmark. 因为实际上，在该特定基准测试中，两个系统都在做类似的事情。 Returning a single row, searched by primary key, is actually not that much work. 通过主键搜索返回单行实际上并没有那么多工作。 It is a very fast operation. 这是一个非常快的操作。 I suspect that cross-process communication overheads are a big part of it. 我怀疑跨进程通信开销是其中很大的一部分。

My guess is, that the more tuned code in MySQL outweighs the slightly less systematic overheads of MongoDB (no logical locks and probably some other small things). 我的猜测是，MySQL中经过更优化的代码要比MongoDB的系统开销少一些（没有逻辑锁，可能还有其他一些小东西）。

This leads to an interesting conclusion: You can use MySQL like a document database and get excellent performance out of it. 这得出一个有趣的结论： 您可以将MySQL像文档数据库一样使用，并从中获得出色的性能。

If the interviewer said: "We don't care about documents or styles, we just need a much faster database, do you think we should use MySQL or MongoDB?", what would I answer? 如果面试官说：“我们不在乎文档或样式，我们只需要一个更快的数据库，您认为我们应该使用MySQL还是MongoDB？”，我该怎么回答？

I'd recommend to disregard performance for a moment and look at the relative strength of the two systems. 我建议暂时忽略性能，并查看两个系统的相对强度。 Things like scaling (way up) and replication come to mind for MongoDB. MongoDB想到了诸如扩展（扩展）和复制之类的事情。 For MySQL, there are a lot more features like rich queries, concurrency models, better tooling and maturity and lots more. 对于MySQL，还有很多功能，例如丰富的查询，并发模型，更好的工具和成熟度等等。

Basically, you can trade features for performance. 基本上，您可以以功能换取性能。 Are willing to do that? 愿意这样做吗？ That is a choice that cannot be made generally. 这是通常无法做出的选择。 If you opt for performance at any cost, consider tuning MySQL first before adding another technology. 如果您不惜一切代价选择性能，请考虑在添加其他技术之前先调整MySQL。

Here is what happens when a client retrieves a single row/document by primary key. 当客户通过主键检索单个行/文档时，将发生以下情况。 I'll annotate the differences between both systems: 我将注释两个系统之间的差异：

Client builds a binary command (same) 客户端建立一个二进制命令（相同）
Client sends it over TCP (same) 客户端通过TCP发送（相同）
Server parses the command (same) 服务器解析命令（相同）
Server accesses query plan from cache (SQL only, not MongoDB, not HandlerSocket) 服务器从缓存访问查询计划（仅SQL，不是MongoDB，不是HandlerSocket）
Server asks B-Tree component to access the row (same) 服务器要求B-Tree组件访问行（相同）
Server takes a physical readonly-lock on the B-Tree path leading to the row (same) 服务器对通往该行的B树路径进行物理只读锁定（相同）
Server takes a logical lock on the row (SQL only, not MongoDB, not HandlerSocket) 服务器对行进行逻辑锁定（仅SQL，不是MongoDB，不是HandlerSocket）
Server serializes the row and sends it over TCP (same) 服务器对行进行序列化并通过TCP发送（相同）
Client deserializes it (same) 客户端反序列化（相同）

There are only two additional steps for typical SQL-bases RDBMS'es. 对于典型的基于SQL的RDBMS，只有两个附加步骤。 That's why there isn't really a difference. 这就是为什么没有真正区别的原因。

In general, MySQL and MongoDB are quite similar in "durable" write performance on a single machine. 通常，MySQL和MongoDB在单台计算机上的“持久”写入性能非常相似。 Simple key/value lookups are almost the same... if you want to use MySQL that way. 简单的键/值查找几乎是相同的……如果您想以这种方式使用MySQL。 Document support is, obviously, a big productivity benefit and a big win for performance. 显然，文档支持可带来巨大的生产力收益和巨大的性能优势。

With automatic sharding... MongoDB is faster in indescribable ways. 使用自动分片... MongoDB以难以描述的方式更快。 Out of the box, with proper design, you can scale out almost linearly without building any logic into your code whatsoever. 开箱即用，通过适当的设计，您几乎可以线性扩展，而无需在代码中构建任何逻辑。

Read/write splitting is also built into almost every driver... which, most, are sponsored or developed by 10gen themselves. 几乎每个驱动程序都内置了读/写拆分功能，大多数由10gen自己赞助或开发。

I've scaled applications before and written read/write splitting code, distributed hashes for sharding, rebalancing jobs running continuously, and added gzip to mysql "document" stores. 我之前已经对应用程序进行了扩展，并编写了读/写拆分代码，用于分片的分布式哈希，重新平衡了连续运行的作业，并将gzip添加到了mysql“ document”存储中。 ugh. 啊。

It's faster because it's simple and focused. 它速度更快，因为它既简单又专注。 It's designed with all of this in mind. 设计时要考虑到所有这些。 Scale on commodity hardware is a priority. 优先考虑商品硬件的规模。 The priorities of a RDBMS are quite different. RDBMS的优先级完全不同。

by default, mongo doesn't do any indexing; 默认情况下，mongo不做任何索引； there are also no transactions. 也没有交易。 however, if you configure a mysql table to be un-indexed, and turn on autocommit, you won't see a huge speed difference. 但是，如果将mysql表配置为未编制索引，然后打开自动提交，则不会看到巨大的速度差异。 writing bits to disk just takes a certain amount of time. 将位写入磁盘仅需要一定的时间。

however, mongo is designed to easily scale. 但是，mongo旨在轻松扩展。 using shards, you can horizontally scale your writes and get much better performance without the complexity of master-master replication. 使用分片，您可以水平扩展写操作并获得更好的性能，而无需使用主-主复制的复杂性。 using replica sets, you can scale your reads horizontally. 使用副本集，您可以水平缩放阅读。 so, i would say that there's a systemic performance improvement, but each query is not necessarily faster. 因此，我想说这是系统性能的提高，但是每个查询不一定更快。