
The optimum size of objects in Ceph Object Storage (RADOS)

Original post: 2014-02-12 11:31:26 5 1 storage/ distributed/ ceph

It looks like RADOS is best suited for use as the storage backend for Ceph Block Storage and File System. But what if I want to use the Object Storage itself:

Is there an optimum object size that gives the best performance?
Is there a problem with a large number of small objects?
How big can objects get without causing trouble?

It would be great if you could share your experience.

1 answer

There is no optimal size for objects in the object store. In fact, this flexibility is one of the big benefits over fixed-size block stores. Typically an application will use this flexibility to decompose its data models along convenient boundaries. That said, if you are storing very small or very large objects, there are some considerations to take into account.

Is there a problem with a large number of small objects?

There has never been a functional problem with small objects, though in the past they have been handled inefficiently because of the way objects are stored. However, in the next release of Ceph (Firefly) there is a way to use LevelDB as a backend, making small objects much more efficient.

How big can objects get without causing trouble?

Assuming that you are using replication in RADOS (in contrast to the proposed object striping feature and the erasure coding backend), an object is replicated in its entirety to a set of physical storage nodes. Thus, the size of an object is inherently limited by the storage capacity of the physical nodes to which the object is replicated.

This mode of operation also implies a practical limitation: per-object I/O performance will correspond to the performance of the physical devices (data and journal drives) that hold the object. This means that it is often useful to think of an object as a unit of I/O parallelism, although in practice many objects will map to the same set of devices.
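As a rough illustration of treating an object as a unit of I/O parallelism, here is a toy model in Python. It is only a back-of-the-envelope sketch: the function name and all parameters are hypothetical, and it assumes each object is served by a single device at a fixed throughput, ignoring replication traffic, journaling, network limits, and uneven placement.

```python
def estimated_transfer_seconds(total_bytes: int,
                               device_bytes_per_sec: float,
                               num_objects: int,
                               num_devices: int) -> float:
    """Toy estimate of the time to transfer a dataset split evenly across objects.

    Assumption (not Ceph behaviour): objects are spread evenly over devices,
    so the number of parallel streams is capped by whichever is smaller,
    the object count or the device count.
    """
    if num_objects < 1 or num_devices < 1 or device_bytes_per_sec <= 0:
        raise ValueError("counts must be >= 1 and throughput must be positive")
    parallel_streams = min(num_objects, num_devices)
    return total_bytes / (device_bytes_per_sec * parallel_streams)
```

In this model a single large object gets the throughput of one device no matter how many devices exist, while splitting the same data across more objects helps only until the object count reaches the device count.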

This question will likely have a different answer for the erasure-coded backend, and applications can always stripe large datasets across smaller objects.
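A minimal sketch of what application-level striping could look like, in Python. The chunk size, the naming scheme, and the helper names are all hypothetical choices for illustration, and a plain dict stands in for the object store so the example runs without a cluster.

```python
from typing import Dict

CHUNK_SIZE = 4 * 1024 * 1024  # hypothetical stripe unit (4 MiB), chosen for illustration


def chunk_name(base: str, index: int) -> str:
    """Hypothetical naming scheme: '<base>.<zero-padded chunk index>'."""
    return f"{base}.{index:08d}"


def stripe(base: str, data: bytes, chunk_size: int = CHUNK_SIZE) -> Dict[str, bytes]:
    """Split data into fixed-size chunks, keyed by the object name of each chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks: Dict[str, bytes] = {}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        chunks[chunk_name(base, index)] = data[offset:offset + chunk_size]
    return chunks


def unstripe(base: str, store: Dict[str, bytes]) -> bytes:
    """Reassemble chunks in index order, stopping at the first missing name."""
    parts = []
    index = 0
    while chunk_name(base, index) in store:
        parts.append(store[chunk_name(base, index)])
        index += 1
    return b"".join(parts)
```

Against a real cluster, each chunk would be written as its own RADOS object, for example with `ioctx.write_full(name, chunk)` from the Python `rados` bindings, assuming an I/O context has already been opened on a pool. Because each chunk is a separate object, chunks can land on different devices and be read or written independently.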