
Does an HBase scan return sorted columns?

I am working on an HBase MapReduce job and need to understand whether the columns in a single column family are returned sorted by their names (keys). If so, I wouldn't need to sort them in the shuffle/sort stage.

Thanks

I have a very similar data model to yours. On insertion, however, I set my own timestamp values on the Put object. I did so by taking a "seed" of the current time and appending an incrementing counter for each event I persisted in the batch.
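A minimal sketch of that timestamp scheme, in case it helps. It assumes "appending a counter" means adding the event's index within the batch to the seed; the answer does not spell out the exact encoding. The class and method names (BatchTimestamps, assign) are hypothetical and not part of HBase.

```java
// Hypothetical helper, not part of HBase: derives one distinct timestamp
// per event in a batch from a single "seed" time.
final class BatchTimestamps {

    // Assumption: the counter is added to the seed, so the event at index i
    // gets seedMillis + i. Timestamps are strictly increasing within a batch.
    static long[] assign(long seedMillis, int eventCount) {
        long[] timestamps = new long[eventCount];
        for (int i = 0; i < eventCount; i++) {
            timestamps[i] = seedMillis + i;
        }
        return timestamps;
    }
}
```

Each value would then be passed as the explicit timestamp when adding a column to the Put, with the seed taken from something like System.currentTimeMillis(). Under this assumption, two batches can produce overlapping timestamps if a batch contains more events than the number of milliseconds that elapse before the next batch starts.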

When I pulled the results out from the Scan, I wrote a comparator:

import java.util.Comparator;
import org.apache.hadoop.hbase.KeyValue;

// Orders KeyValues by ascending timestamp.
public class KVTimestampComparator implements Comparator<KeyValue> {

    @Override
    public int compare(KeyValue kv1, KeyValue kv2) {
        // Long.compare works on the primitive timestamps without boxing.
        return Long.compare(kv1.getTimestamp(), kv2.getTimestamp());
    }
}

Then I sorted the raw row:

List<KeyValue> row = Arrays.asList(result.raw());
Collections.sort(row, new KVTimestampComparator());
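One detail of the two lines above is worth seeing in isolation: Arrays.asList returns a fixed-size list backed by the array it wraps, so sorting the list also reorders that array. The sketch below shows this in plain Java without an HBase dependency. The Cell class is a hypothetical stand-in for KeyValue, and TimestampSortSketch and sortInPlace are illustrative names only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class TimestampSortSketch {

    // Hypothetical stand-in for an HBase KeyValue: a qualifier and a timestamp.
    static final class Cell {
        final String qualifier;
        final long timestamp;

        Cell(String qualifier, long timestamp) {
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }
    }

    // Ascending timestamp order, the same ordering KVTimestampComparator defines.
    static final Comparator<Cell> BY_TIMESTAMP =
            Comparator.comparingLong(c -> c.timestamp);

    // Wraps the array in a list view and sorts it. Because the list is backed
    // by the array, the elements of `raw` end up sorted as well.
    static List<Cell> sortInPlace(Cell[] raw) {
        List<Cell> row = Arrays.asList(raw);
        Collections.sort(row, BY_TIMESTAMP);
        return row;
    }
}
```

If the original array order is still needed afterwards, copying into a new ArrayList before sorting avoids the write-through.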

I got this idea from the person who answered this question: Sorted results from hbase scanner

No, columns are not sorted. They are stored internally as key-value pairs in a long byte array. That said, you should clarify what you actually need this for.

