
Single data column vs multiple columns in Cassandra

Original post: 2021-03-10 11:01:14. Tags: java, cassandra

I'm working on a project with an existing Cassandra database. The schema looks like this:

partition key (big int)	clustering key1 (timestamp)	data (text)
1	2021-03-10 11:54:00.000	{a:"somedata", b:2, ...}

My question is: Is there any advantage to storing the data in a JSON string? Will it save some space?

So far I have only found disadvantages:

You cannot (easily) add or drop columns at runtime, since the application could overwrite the JSON string column.
Parsing the JSON string is currently the performance bottleneck.

3 answers

No, there is no real advantage to storing JSON as a string in Cassandra unless the underlying data in the JSON is really schema-less. It will also not save space; in fact it will use more, because every row has to store each key together with its value instead of just the value.

If you can, I would recommend mapping the keys to CQL columns so you can store the values natively and accessing the data is more flexible. Cheers!
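To put rough numbers on the space argument above, here is a minimal Java sketch that compares the payload bytes of the question's example row stored as one JSON string against the same two values stored natively. The class and method names are hypothetical, the key names `a` and `b` come from the question's sample, and the sketch assumes `a` maps to a CQL `text` column and `b` to a CQL `int` (4 bytes). It deliberately ignores Cassandra's per-row and per-cell overhead and any compression, so treat it as an illustration rather than a measurement.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: payload bytes of a JSON string column versus the
// same two values stored as native CQL text and int values.
class JsonVsColumnsSize {

    // A CQL text value is stored as its UTF-8 encoding.
    static int textBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Single JSON column: keys, quotes and punctuation are repeated in every row.
    // Assumption: 'a' needs no JSON escaping (true for the sample value).
    static int jsonPayloadBytes(String a, int b) {
        String json = "{\"a\":\"" + a + "\",\"b\":" + b + "}";
        return textBytes(json);
    }

    // Two native columns: the text value plus a fixed 4-byte CQL int.
    static int nativePayloadBytes(String a) {
        return textBytes(a) + Integer.BYTES;
    }

    public static void main(String[] args) {
        System.out.println("json payload bytes:   " + jsonPayloadBytes("somedata", 2));
        System.out.println("native payload bytes: " + nativePayloadBytes("somedata"));
    }
}
```

For the sample values this counts 22 bytes for the JSON string and 12 bytes for the native values; the gap grows with the number and length of the keys.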

Erick's answer is spot-on.

The only thing I'd add is that storing JSON blobs in a single column makes updates (even more) problematic. If you update a single JSON property, the whole column gets rewritten. The original JSON blob is also still on disk, just "obsoleted", until compaction runs. The only time storing a JSON blob in a single column makes any sense is when the properties don't change.

And I agree, mapping the keys to CQL columns is a much better option.
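The update problem described above can be sketched in Java as follows. The table names `events` and `events_by_column`, the column names `id` and `ts`, and the helper methods are all hypothetical; `toJson` is a toy serializer (strings and numbers only, no escaping) that stands in for whatever JSON library the application actually uses. The point is only that changing one property of the JSON layout means binding the whole document again, while the column-per-key layout writes a single cell.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: what has to be written to change property "b".
class UpdateOneProperty {

    // JSON layout: the complete document is bound as the new value of 'data'.
    static final String UPDATE_JSON =
            "UPDATE events SET data = ? WHERE id = ? AND ts = ?";

    // Column-per-key layout: only column 'b' is written.
    static final String UPDATE_COLUMN =
            "UPDATE events_by_column SET b = ? WHERE id = ? AND ts = ?";

    // Toy serializer: handles String and numeric values, performs no escaping.
    static String toJson(Map<String, Object> doc) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            if (sb.length() > 1) {
                sb.append(',');
            }
            sb.append('"').append(e.getKey()).append("\":");
            Object v = e.getValue();
            if (v instanceof String) {
                sb.append('"').append(v).append('"');
            } else {
                sb.append(v);
            }
        }
        return sb.append('}').toString();
    }

    // Read-modify-write: copy the parsed document, change one key, and
    // serialize everything again so it can be bound to UPDATE_JSON.
    static String rewriteJson(Map<String, Object> current, String key, Object newValue) {
        Map<String, Object> copy = new LinkedHashMap<>(current);
        copy.put(key, newValue);
        return toJson(copy);
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("a", "somedata");
        row.put("b", 2);
        System.out.println(UPDATE_JSON + "   <- bind " + rewriteJson(row, "b", 3));
        System.out.println(UPDATE_COLUMN + "   <- bind 3");
    }
}
```

With the JSON layout, two writers changing different properties of the same row can also overwrite each other's changes, because each one writes back the full document it read earlier.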

I don't disagree with the excellent and already accepted answer by @erick-ramirez.

However, there is often a good case for using frozen UDTs instead of separate columns when the related data is only ever set and retrieved together and will not be filtered on in your queries.

The "frozen" part is important: it means less work for Cassandra, but it also means that you rewrite the whole value on each update.

This can give a large performance boost compared with a large number of separate columns. The nice ScyllaDB people have a great post on that:

If You Care About Performance, Employ User Defined Types

(I know ScyllaDB is not exactly Cassandra, but I've seen multiple articles that say the same thing about Cassandra.)

One downside is that you add work to the application layer and sometimes mapping complex UDTs to your Java types will be interesting.
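As a rough illustration of the frozen UDT approach, the sketch below holds hypothetical CQL for a type and a table as Java string constants, plus a small Java class that mirrors the type. The names `payload`, `events_udt`, `id`, `ts` and the `Payload` class are assumptions made for the example, with fields `a` and `b` taken from the question's sample. Rendering a CQL literal by hand is only done here to keep the example free of driver dependencies; a real application would normally bind the value through its driver.

```java
// Hypothetical sketch: a frozen UDT in place of a JSON string column.
class FrozenUdtSketch {

    // Assumed type with the two fields from the question's sample data.
    static final String CREATE_TYPE =
            "CREATE TYPE IF NOT EXISTS payload (a text, b int)";

    // 'data' is frozen: it is stored and replaced as a single value.
    static final String CREATE_TABLE =
            "CREATE TABLE IF NOT EXISTS events_udt ("
            + "id bigint, ts timestamp, data frozen<payload>, "
            + "PRIMARY KEY (id, ts))";

    // Java-side mirror of the UDT; this is the extra mapping work in the application layer.
    static final class Payload {
        final String a;
        final int b;

        Payload(String a, int b) {
            this.a = a;
            this.b = b;
        }

        // Renders a CQL UDT literal; single quotes in text are escaped by doubling them.
        String toCqlLiteral() {
            return "{a: '" + a.replace("'", "''") + "', b: " + b + "}";
        }
    }

    // Because the column is frozen, an update always supplies the complete value.
    static String updateStatement(Payload p) {
        return "UPDATE events_udt SET data = " + p.toCqlLiteral()
                + " WHERE id = ? AND ts = ?";
    }

    public static void main(String[] args) {
        System.out.println(CREATE_TYPE);
        System.out.println(CREATE_TABLE);
        System.out.println(updateStatement(new Payload("somedata", 3)));
    }
}
```

Unlike the JSON string, the field names live in the schema rather than in every row, and the values keep their native types, while the whole-value rewrite on update remains.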





solution1
5 ACCEPTED 2021-03-10 11:11:28

solution2
1 2021-03-10 14:16:18

solution3
0 2022-12-05 17:57:28
