Advice on Converting a Relational Schema to Cassandra
I am hoping to get some suggestions on how best to approach converting a typical relational schema to Cassandra. The relational schema is:
CREATE TABLE IF NOT EXISTS sales (
    sale_id     bigint(20) UNSIGNED NOT NULL AUTO_INCREMENT,
    create_time timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
    account     bigint(20) UNSIGNED NOT NULL DEFAULT '0',
    store       char(25) NOT NULL DEFAULT '',
    product     char(25) NOT NULL DEFAULT '',
    coupon      char(18) NOT NULL DEFAULT '',
    amount      decimal(8,2) NOT NULL,
    PRIMARY KEY (sale_id),
    KEY create_time (create_time)
);
The Cassandra schema I've come up with is:
CREATE TABLE sales (
    sale_id     uuid,
    create_time timestamp,
    account     text,
    store       int,
    coupon      text,
    product     text,
    amount      int,
    PRIMARY KEY ((create_time, store), coupon)
);
(with indexes created on the non-key columns I need to query)
A typical query is to get all sales by product/coupon/account/store over some time period.
Does this make sense?
Any suggestions on how this may be improved for reasonable read/write performance?
Thanks in advance for any suggestions.
No. To get good performance, you want to model your Cassandra schema around the questions you need to answer, with a table for each question. Say you want to find all (recent) sales by product: then you would create your primary key as (productID, created_time).
If your application normally searches for products that were sold recently, then you want to order the clustering column (created_time in your example) as DESC.
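As a rough sketch of what that could look like in CQL: the table name sales_by_product, the use of sale_id as a tie-breaking clustering column, and the decimal type for amount are assumptions for illustration, and the column names follow the schema in the question (product, create_time) rather than productID and created_time.

```cql
-- Hypothetical query table: one partition per product, newest sales first.
CREATE TABLE sales_by_product (
    product     text,
    create_time timestamp,
    sale_id     uuid,       -- assumed tie-breaker so two sales at the same instant do not overwrite each other
    account     text,
    store       int,
    coupon      text,
    amount      decimal,    -- assumed, to match decimal(8,2) in the relational schema
    PRIMARY KEY ((product), create_time, sale_id)
) WITH CLUSTERING ORDER BY (create_time DESC, sale_id ASC);

-- All sales of one product in a time window, read from a single partition.
SELECT * FROM sales_by_product
WHERE product = 'widget'
  AND create_time >= '2015-01-01'
  AND create_time <  '2015-02-01';
```

Because product is the partition key and create_time is the first clustering column, the range query reads one partition in its stored order. A very popular product could make that partition large; adding a time bucket (for example, the month) to the partition key is a common way to bound it, though whether you need that depends on your volumes.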
Likewise, you may end up duplicating your sales data in multiple column families. Don't be scared to duplicate data when modeling in a distributed environment. You want to de-normalize and aim to get each result from a single partition.
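A minimal sketch of that duplication, assuming two more of the access paths from the question (by store and by account). The table names, the sale_id tie-breaker, the decimal type for amount, and the sample values are all made up for illustration.

```cql
-- Hypothetical copies of the same sales data, each keyed for one query.
CREATE TABLE sales_by_store (
    store int, create_time timestamp, sale_id uuid,
    product text, coupon text, account text, amount decimal,
    PRIMARY KEY ((store), create_time, sale_id)
) WITH CLUSTERING ORDER BY (create_time DESC, sale_id ASC);

CREATE TABLE sales_by_account (
    account text, create_time timestamp, sale_id uuid,
    product text, coupon text, store int, amount decimal,
    PRIMARY KEY ((account), create_time, sale_id)
) WITH CLUSTERING ORDER BY (create_time DESC, sale_id ASC);

-- Write each sale to every copy. A logged batch ensures that either all of
-- the inserts eventually apply or none do, so the copies do not drift apart.
BEGIN BATCH
  INSERT INTO sales_by_store (store, create_time, sale_id, product, coupon, account, amount)
  VALUES (42, '2015-01-15 10:30:00', 123e4567-e89b-12d3-a456-426614174000, 'widget', 'SPRING10', 'acct-1001', 19.99);
  INSERT INTO sales_by_account (account, create_time, sale_id, product, coupon, store, amount)
  VALUES ('acct-1001', '2015-01-15 10:30:00', 123e4567-e89b-12d3-a456-426614174000, 'widget', 'SPRING10', 42, 19.99);
APPLY BATCH;
```

The cost is extra writes and storage, which is usually the cheaper side of the trade in Cassandra; in exchange, each query can be answered from one partition without relying on secondary indexes.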
Hope this helps.