Cassandra integration with Hadoop for read performance

Question

I am using Apache Cassandra to store around 100 million records on a single node with the following specifications:

RAM: 32 GB, HDD: 2 TB, Intel quad-core processor.

Cassandra has a read performance problem: some queries take around 40 minutes to return results. While searching for ways to improve read performance, I came across the following factors:

Compaction strategy, compression, key cache, increasing the heap size, and turning off swap for Cassandra.

After applying these optimizations, performance remains the same. After further searching, I came across the idea of integrating Hadoop with Cassandra. Is that the right way to run queries against Cassandra, or am I missing other factors? Thanks.

Answer 1

It looks like your data model could be improved. 40 minutes is far too long. I download all the data from 6 million records (around 10 GB) within a few minutes, and I think it takes that long only because I convert and store the data as I download it. Trivial selects should take milliseconds.

Did you design it around the queries you need to run?
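
To make the point about designing around queries concrete, here is a minimal sketch in plain Python. It does not use Cassandra or any driver; the records, the `customer_id` field, and all function names are hypothetical. It only contrasts scanning every row with looking rows up in a structure keyed by the value the query filters on, which is roughly the difference between a query that cannot use the partition key and one that can.

```python
from collections import defaultdict

# Hypothetical records: (customer_id, order_id, amount).
rows = [(f"cust{i % 1000}", i, float(i)) for i in range(100_000)]


def scan_for_customer(all_rows, customer_id):
    """Full scan: every row is examined, whatever the result size."""
    return [r for r in all_rows if r[0] == customer_id]


def build_query_table(all_rows):
    """Group rows by the value the query filters on (the 'partition key')."""
    table = defaultdict(list)
    for r in all_rows:
        table[r[0]].append(r)
    return table


def lookup_for_customer(table, customer_id):
    """Keyed lookup: only the rows of one group are touched."""
    return table.get(customer_id, [])


# Built once at write time, then reused for every read.
orders_by_customer = build_query_table(rows)

scanned = scan_for_customer(rows, "cust7")
looked_up = lookup_for_customer(orders_by_customer, "cust7")
print(len(scanned), len(looked_up))
```

Both calls return the same rows, but the scan does work proportional to the whole data set while the lookup does work proportional to the result. If the slow queries filter on columns that are not part of the key, that is the first thing I would check before adding Hadoop.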
