
Cassandra performance with a very high number of columns per row

I am considering storing data with between 100 and 250 million columns per row, with at most 2-3k rows in a column family. I will use composite columns to allow slicing the data, and will limit the slice range to a reasonable value that can be handled within process memory limits.

One CF will have no column values, just column names (100-250 million of them); the other CF will have the same number of columns, but with approximately 20-30 KB of data per column value.
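For reference, here is a sketch of how the two CFs might be expressed in modern CQL, where the old wide row maps to a partition and the composite column name maps to a clustering column (all table and column names below are hypothetical):

```sql
-- Hypothetical CQL rendering of the two column families.
-- The old row key becomes the partition key; the composite
-- column name becomes the clustering column.

-- CF 1: column names only, no values.
CREATE TABLE names_only (
    row_key  text,
    name     text,          -- the composite column name
    PRIMARY KEY (row_key, name)
);

-- CF 2: same names, plus a ~20-30 KB payload per column.
CREATE TABLE names_with_data (
    row_key  text,
    name     text,
    payload  blob,          -- approx 20-30 KB per value
    PRIMARY KEY (row_key, name)
);
```

Note that 250 million columns at 20-30 KB each is on the order of 5-7 TB in a single row, which is why such rows are often split by adding an extra bucketing component to the key.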

I assume slicing does not require Cassandra to load all column names into memory in order to slice the data.
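That assumption is essentially right: column names within a row are stored sorted, so a slice is a seek plus a bounded scan. The usual pattern for walking a very wide row is to page through it, restarting each slice just past the last column seen. A minimal in-memory simulation of that pattern (all names and sizes here are made up):

```python
import bisect

def slice_columns(sorted_names, start, limit):
    """Return up to `limit` column names >= `start`, like a
    Cassandra range slice over one row: names are kept sorted,
    so the server never materialises the whole row."""
    i = bisect.bisect_left(sorted_names, start)
    return sorted_names[i:i + limit]

def iterate_row(sorted_names, page_size):
    """Walk an arbitrarily wide row in fixed-size pages, restarting
    each slice just past the last column seen."""
    start = ""
    while True:
        page = slice_columns(sorted_names, start, page_size)
        if not page:
            return
        yield page
        start = page[-1] + "\x00"   # smallest name strictly greater

# Simulated wide row with 10,000 columns.
row = [f"col{i:08d}" for i in range(10_000)]
pages = list(iterate_row(row, 1000))
print(len(pages), sum(len(p) for p in pages))  # prints: 10 10000
```

Only `page_size` columns are held in memory at a time, which is exactly the "reasonable slice range" constraint described above.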

Only about 5% of rows will have such a high column count; the rest will have at most 15-20 million columns.

Has anyone tried such a large number of columns per row in a column family, and how was the performance?

If the above works fine, it saves me a great deal of work managing multiple CFs.

Thanks

I have worked with data volumes close to those you describe. A range slice is not very fast, but it doesn't get much slower as the data size grows, apart from the overhead of Cassandra having to return more columns. However, the fastest way to query is when you know in advance all the column names you want.
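The point about knowing the names in advance can be illustrated: a by-name fetch resolves each requested column directly, while a range slice has to scan between its bounds. A toy in-memory comparison (names here are hypothetical; the slice is simulated naively by filtering all names):

```python
# Simulate one wide row as a mapping from column name to value.
row = {f"col{i:06d}": i for i in range(100_000)}

def get_by_names(row, names):
    """By-name fetch: one O(1) lookup per requested column,
    independent of how wide the row is."""
    return {n: row[n] for n in names if n in row}

def range_slice(row, start, end):
    """Range slice over names in [start, end); in Cassandra this is
    a seek plus a scan, simulated here by filtering every name."""
    return {n: v for n, v in sorted(row.items()) if start <= n < end}

print(get_by_names(row, ["col000042", "col099999"]))
# prints: {'col000042': 42, 'col099999': 99999}
```

The by-name path stays cheap no matter how many columns the row holds, which matches the observation that slices mainly pay for the columns they return.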

Your setup has almost no downside: you are not using supercolumns and you have a flat data structure, which is exactly what Cassandra is good at; after all, it is a key-value store.
