I set up Cassandra with the default configuration on a clean AWS instance and inserted 10,000 columns into a single row, each column holding about 1 MB of data. I used this Ruby (version 1.9.3) script:
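10000.times do
  # Random column name: a base-36 string of up to 8 characters.
  key = rand(36**8).to_s(36)
  # Roughly 1 MB value: up to 1024 base-36 characters repeated 1024 times.
  value = rand(36**1024).to_s(36) * 1024
  Cas_client.insert(TestColumnFamily, TestRow, { key => value })
end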
Every time I run this script, it crashes:
/usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/socket.rb:109:in `read': CassandraThrift::Cassandra::Client::TransportException
from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/base_transport.rb:87:in `read_all'
from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/framed_transport.rb:104:in `read_frame'
from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/transport/framed_transport.rb:69:in `read_into_buffer'
from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/client.rb:45:in `read_message_begin'
from /usr/local/lib/ruby/gems/1.9.1/gems/thrift-0.8.0/lib/thrift/client.rb:45:in `receive_message'
from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/vendor/0.8/gen-rb/cassandra.rb:251:in `recv_batch_mutate'
from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/vendor/0.8/gen-rb/cassandra.rb:243:in `batch_mutate'
from /usr/local/lib/ruby/gems/1.9.1/gems/thrift_client-0.8.1/lib/thrift_client/abstract_thrift_client.rb:150:in `handled_proxy'
from /usr/local/lib/ruby/gems/1.9.1/gems/thrift_client-0.8.1/lib/thrift_client/abstract_thrift_client.rb:60:in `batch_mutate'
from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/lib/cassandra/protocol.rb:7:in `_mutate'
from /usr/local/lib/ruby/gems/1.9.1/gems/cassandra-0.15.0/lib/cassandra/cassandra.rb:463:in `insert'
from a.rb:6:in `block in <main>'
from a.rb:3:in `times'
from a.rb:3:in `<main>'
Cassandra itself still appears to run normally. I then ran another Ruby script to count how many columns had been inserted:
p cas_client.count_columns(TestColumnFamily,TestRow)
This script crashed as well, with the same error message, and the Cassandra process stayed at 100% CPU usage.
Environment: AWS m1.xlarge instance (15 GB memory, 800 GB disk, 4 CPU cores)
cassandra-1.1.2
ruby-1.9.3-p194
jdk-7u6-linux-x64
ruby-gems:
cassandra (0.15.0)
thrift (0.8.0)
thrift_client (0.8.1)
What is the problem?
10,000 columns at 1 MB each is about 10 GB of data.
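As a rough check of that figure, the arithmetic can be worked through in a few lines of Ruby. This is only a back-of-the-envelope sketch: the constant names are made up for illustration, and it assumes each value reaches the maximum length the question's script can produce.

```ruby
# Back-of-the-envelope size estimate for the row written by the question's script.
# Constant names are illustrative only; they do not come from the original script.
CHARS_PER_CHUNK = 1024    # rand(36**1024).to_s(36) yields at most 1024 base-36 digits
REPEATS         = 1024    # the string is then repeated 1024 times
COLUMNS         = 10_000  # number of columns inserted into the single row

bytes_per_value = CHARS_PER_CHUNK * REPEATS   # upper bound per column, in bytes
total_bytes     = bytes_per_value * COLUMNS   # upper bound for the whole row
total_gib       = total_bytes / (1024.0**3)

puts "per column: #{bytes_per_value} bytes"
puts format("whole row: about %.2f GiB", total_gib)
```

Column names and storage overhead are ignored here, so the real on-disk size would be somewhat larger.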
Cassandra's RPC layer uses Thrift, which requires the entire return value of an RPC call to fit in memory. Trying to read all the columns at once would therefore mean loading a roughly 10 GB Thrift object into memory, which is not practical, especially in Ruby.
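One common way around this is to read the row in small slices rather than in one call. The following is a minimal sketch of that idea, not the cassandra gem's own API: `each_column_page` and `fetch_from_row` are hypothetical names, and an in-memory Hash stands in for the wide row so the example runs without a cluster. It assumes columns come back sorted by name and that the start column is inclusive.

```ruby
# Hypothetical helper: walks a wide row in fixed-size pages instead of one huge read.
# `fetch_page` is any callable taking (start_column, count) and returning an
# ordered Hash of column_name => value, with `start_column` inclusive.
def each_column_page(fetch_page, page_size)
  raise ArgumentError, "page_size must be at least 2" if page_size < 2

  start = ""          # empty string means "from the first column"
  first_page = true
  loop do
    page = fetch_page.call(start, page_size).to_a
    # After the first page, the start column was already yielded, so drop it.
    page.shift unless first_page
    break if page.empty?

    yield page
    start = page.last[0]   # resume from the last column name seen
    first_page = false
  end
end

# In-memory stand-in for a sorted wide row (assumption: columns sort by name).
ROW = (1..10).map { |i| [format("col%02d", i), "v#{i}"] }.to_h

fetch_from_row = lambda do |start, count|
  ROW.select { |name, _| name >= start }.first(count).to_h
end

seen = 0
each_column_page(fetch_from_row, 4) { |page| seen += page.size }
puts seen
```

Against a real cluster, the callable would presumably wrap a sliced read such as the gem's `get` with `:start` and `:count` options; treat that as an assumption to verify against the gem version in use. With 1 MB values, the page size would need to stay small so that each response remains a few megabytes.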