
How do we define an HBase row key so that records are retrieved efficiently when the table holds millions of records?

I have 30 million records in a table, but retrieving a single record takes too much time. Could you suggest how I should generate the row key so that records can be fetched quickly?

Right now I use an auto-increment ID (1, 2, 3, and so on) as the row key. What steps should I take to improve performance? Let me know your thoughts.

Generally, when tuning a structured SQL table for performance, we follow some basic steps: index the columns used in queries, apply sensible logical partitioning or bucketing to the table, and allocate enough buffer memory for complex operations.

When it comes to big data, and especially Hadoop, the real problems come from switching between hard disk and memory buffers, and from switching between different servers. To get better performance you need to reduce both as much as possible.

Some notes:

Use the EXPLAIN feature to inspect how the query is executed, and use that to improve performance.

If you use an integer row key, it will generally give the best performance, but always define the row key/index when you first create the table, because adding or changing it later is a performance killer.
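One detail worth checking with integer keys: HBase stores row keys as byte arrays sorted lexicographically, so an ID written as a decimal string ("1", "10", "2") does not sort in numeric order. The sketch below is only an illustration in plain Python with no HBase client involved; `encode_row_key` and `decode_row_key` are hypothetical helper names, and the fixed 8-byte big-endian layout is an assumption, not something the question specifies.

```python
import struct


def encode_row_key(record_id: int) -> bytes:
    """Encode a non-negative integer ID as 8 big-endian bytes (assumed layout)."""
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    return struct.pack(">Q", record_id)


def decode_row_key(row_key: bytes) -> int:
    """Inverse of encode_row_key."""
    return struct.unpack(">Q", row_key)[0]


ids = [1, 2, 10, 100, 30_000_000]

# Byte-wise sort of decimal strings: "10" and "100" land before "2".
string_keys = sorted(str(i).encode() for i in ids)
string_order = [int(k) for k in string_keys]

# Byte-wise sort of fixed-width big-endian keys matches numeric order.
binary_keys = sorted(encode_row_key(i) for i in ids)
binary_order = [decode_row_key(k) for k in binary_keys]

print(string_order)
print(binary_order)
```

With a fixed-width encoding, a range of IDs maps to one contiguous range of row keys, which is what makes start/stop-row scans on the key meaningful.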

When creating external tables in Hive/Impala against HBase tables, map the HBase row key to a string column in Hive/Impala. If this is not done, the row key is not used in the query and the entire table is scanned.

Never use LIKE in a row-key query, because it scans the whole table. Use BETWEEN or =, <, >= instead. If your query does not filter on the row-key column, your row-key design may be wrong. The row key should be designed to contain the information you need to find specific subsets of data.
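To make the difference between a range filter and a LIKE-style match concrete, here is a rough simulation in plain Python. It does not talk to HBase: a sorted list stands in for the sorted row keys, and the composite key format `<customer_id>|<yyyymmdd>`, the data, and the function names are all made up for illustration.

```python
from bisect import bisect_left

# Hypothetical composite keys, kept sorted the way a row-key-ordered store keeps them.
ROWS = sorted(
    f"cust{c:04d}|202401{d:02d}".encode()
    for c in range(1, 201)
    for d in range(1, 11)
)


def range_scan(rows, start, stop):
    """Return keys in [start, stop) and how many keys had to be examined."""
    lo = bisect_left(rows, start)
    hi = bisect_left(rows, stop)
    return rows[lo:hi], hi - lo


def substring_scan(rows, needle):
    """LIKE '%needle%' style match: every key has to be examined."""
    matches = [k for k in rows if needle in k]
    return matches, len(rows)


def prefix_stop(prefix: bytes) -> bytes:
    """Smallest key above all keys with this prefix (assumes last byte < 0xFF)."""
    return prefix[:-1] + bytes([prefix[-1] + 1])


prefix = b"cust0042|"
hits, examined = range_scan(ROWS, prefix, prefix_stop(prefix))
like_hits, like_examined = substring_scan(ROWS, prefix)

print(len(hits), examined)            # rows returned, rows examined by the range scan
print(len(like_hits), like_examined)  # same rows returned, but every row examined
```

Both approaches return the same rows, but the range scan only touches the contiguous slice of keys that starts with the prefix, which is why putting the lookup fields at the front of the row key matters.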
