High Performance Development

Original post: 2012-08-17 06:11:55 | Tags: c#, performance, design-patterns

Background

We have been working very hard to come up with solutions for a "High Performance" application. The application is basically a high-throughput in-memory manager with a sync back to disk. The volume of "reads" and "writes" is tremendously high, around 3000 transactions a second. We try to do as much as possible in memory, but eventually the data gets stale and needs to be flushed to disk, and this is where a huge "bottleneck" ensues. The app is multi-threaded, with about 50 threads. There is no IPC (inter-process communication).

Attempts

We initially wrote this in Java, and it worked quite well up to a certain load, at which point the bottleneck was hit and it just couldn't keep up. Then we tried it in C#, and the same bottleneck was reached. We tried this with unmanaged code (C#), and although it was blindingly fast in initial tests using MMF (memory-mapped files), in production reading was slow (we are using Views). We did try Couchbase, but we ran into problems surrounding high network utilization. This might be poor configuration on our part!

Extra Info: In our Java attempt (non-MMF), the queue of information that needs to be flushed to disk grows until the thread draining it can no longer keep up with "writing" to disk. In our C# memory-mapped file approach, the problem is that READS are very slow, while the WRITES work perfectly. For some reason, the Views are slow!

Question

So the question is: for situations where you intend to transfer massive amounts of data, can someone please suggest a possible approach or architectural design that might help? I know this seems a bit broad, but I think the specific nature of high performance, high throughput should narrow down the answers.

Can anyone vouch for using Couchbase, MongoDB or Cassandra at such a level? Other ideas or solutions would be appreciated.

3 Answers

First off, I would like to make clear that I have little (if any) experience building high-performance, scalable applications.

Martin Fowler has a description of the LMAX architecture that allowed an application to process about 6 million orders per second on a single thread. I'm not sure it can help you (as you seemingly need to move a lot of data), but maybe you can get some ideas from it: http://martinfowler.com/articles/lmax.html

The architecture is based on Event Sourcing, which is often used to provide (relatively) easy scalability.
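To make the Event Sourcing idea a bit more concrete, here is a minimal sketch in Java. It is not LMAX's code; the class `EventSourcedBalances`, the `Deposited` event and the account-balance domain are all hypothetical, chosen only to show the pattern: every state change is appended to a journal before it is applied to in-memory state, so the state can be rebuilt by replaying the journal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example domain; a sketch of the pattern, not LMAX's implementation.
final class EventSourcedBalances {
    // An immutable event describing one state change.
    static final class Deposited {
        final String account;
        final long amount;

        Deposited(String account, long amount) {
            this.account = account;
            this.amount = amount;
        }
    }

    private final List<Deposited> journal = new ArrayList<>();   // append-only event log
    private final Map<String, Long> balances = new HashMap<>();  // in-memory state

    // Record the event first, then apply it to the in-memory state.
    void handle(Deposited event) {
        journal.add(event);
        balances.merge(event.account, event.amount, Long::sum);
    }

    long balanceOf(String account) {
        return balances.getOrDefault(account, 0L);
    }

    // Read-only view of the journal, e.g. for persisting or replaying.
    List<Deposited> journal() {
        return Collections.unmodifiableList(journal);
    }

    // Rebuild the state from scratch by replaying a journal in order.
    static EventSourcedBalances replay(List<Deposited> events) {
        EventSourcedBalances rebuilt = new EventSourcedBalances();
        for (Deposited event : events) {
            rebuilt.handle(event);
        }
        return rebuilt;
    }
}
```

In a real system the journal would be written sequentially to disk rather than kept in a list, which is the part that tends to suit disks well: appends only, no random writes.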

Massive amounts of data and disk access. What kind of disk are we talking about? HDDs tend to spend a lot of time moving the head around if you work with more than one file. (That shouldn't be a problem if you use SSDs, though.) Also, you should take advantage of the fact that memory-mapped files are managed in page-sized chunks. Data structures should be aligned to page boundaries, if possible.
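As a rough illustration of page alignment with a memory-mapped file, here is a sketch in Java using `FileChannel.map`. The 4096-byte page size and the 64-byte fixed record size are assumptions (check the actual page size on your platform), and `PageAlignedStore` is a hypothetical name. Because the record size divides the page size evenly and the mapping starts at offset 0, no record straddles two pages.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; page and record sizes are assumptions for illustration.
final class PageAlignedStore {
    static final int PAGE_SIZE = 4096;  // assumed page size in bytes
    static final int RECORD_SIZE = 64;  // assumed fixed record size; divides PAGE_SIZE evenly

    // Byte offset of a record; records never cross a page boundary.
    static long offsetOf(long index) {
        return index * RECORD_SIZE;
    }

    // Round a byte length up to a whole number of pages.
    static long roundUpToPage(long bytes) {
        return ((bytes + PAGE_SIZE - 1) / PAGE_SIZE) * PAGE_SIZE;
    }

    // Map enough whole pages to hold the record, write a value into it, and read it back.
    static long writeAndReadBack(Path file, int index, long value) throws IOException {
        long mappedSize = roundUpToPage(offsetOf(index) + RECORD_SIZE);
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping in READ_WRITE mode grows the file to mappedSize if it is smaller.
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, mappedSize);
            int position = (int) offsetOf(index);
            buffer.putLong(position, value);
            buffer.force(); // ask the OS to flush the dirty page(s) to disk
            return buffer.getLong(position);
        }
    }
}
```

Whether this helps depends on where the time actually goes; it only avoids touching two pages for one record.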

But in any case, you must make sure you know what the bottleneck is. Optimizing data structures wouldn't help much if you actually lose the time due to thread synchronization, for example. And if you're using an HDD, page alignment might not help as much as stuffing everything into a single file somehow. So use appropriate tools to figure out which brakes are still holding you back.

Using a general-purpose database implementation might not help you as much as you hope. They are, after all, general-purpose. If performance really is that much of an issue, a special implementation with your requirements in mind might outperform these more general implementations.

If you want speed, avoid persistence and queues as much as possible for writes, and use in-memory stores/caching for reads.
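One way to read the caching half of that advice is a small read-through cache in front of the slow store. The sketch below is an assumption about what such a cache could look like, not something from the question: `ReadThroughCache`, its `loader` function and the least-recently-used eviction policy are all hypothetical choices, built on the standard `LinkedHashMap` in access order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical read-through LRU cache; the loader stands in for a slow disk read.
final class ReadThroughCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;
    private int loads = 0; // number of times the slow loader was called

    ReadThroughCache(int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // Serve from memory when possible; otherwise load once and remember the result.
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, value);
        }
        return value;
    }

    synchronized int loads() {
        return loads;
    }
}
```

A single lock like this would itself become a bottleneck with 50 threads, so treat it as a shape to start from rather than something to drop in as is.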

Language has little to do with it.