Java: how to store sparse data efficiently

I have more than 1 billion items with approximately 1000 columns (a matrix). But for 95% of the columns the ratio of unique values is less than one percent, so this data could be classified as sparse data.

What is an efficient, production-ready solution for storing such data in Java?

Not sure if you've thought this through. A billion rows by 1000 columns is on the order of 10^12 cells, so even if you find a mechanism to store your sparse matrix efficiently, you may well have problems holding that much data in memory anyway.

You could, however, use a simple map whose key is a Pair which holds the row and column for the datum.

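import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class Pair<P, Q> {

    public final P p;
    public final Q q;

    public Pair(P p, Q q) {
        this.p = p;
        this.q = q;
    }

    // equals and hashCode are required for Pair to work as a HashMap key.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Pair)) return false;
        Pair<?, ?> other = (Pair<?, ?>) o;
        return Objects.equals(p, other.p) && Objects.equals(q, other.q);
    }

    @Override
    public int hashCode() {
        return Objects.hash(p, q);
    }
}

class Datum {
}
// My sparse database, keyed by (row, column).
Map<Pair<Integer, Integer>, Datum> data = new HashMap<>();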

This stores only the cells that are actually present, so it uses close to minimal storage, but it does not necessarily solve your problem.
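
As a variation on the same idea, here is a minimal sketch that packs the row and column into a single long key instead of a Pair, and treats absent cells as a default value. The class name SparseMatrix and its methods are made up for illustration, and it assumes row and column indices each fit in an int. It is a sketch, not a tuned implementation: the keys and values are still boxed objects in a HashMap.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sparse matrix: only cells that differ from the default are stored.
class SparseMatrix<V> {
    private final Map<Long, V> cells = new HashMap<>();
    private final V defaultValue;

    SparseMatrix(V defaultValue) {
        this.defaultValue = defaultValue;
    }

    // Pack (row, col) into one long: row in the high 32 bits, col in the low 32 bits.
    private static long key(int row, int col) {
        return ((long) row << 32) | (col & 0xFFFFFFFFL);
    }

    void set(int row, int col, V value) {
        if (value == null || value.equals(defaultValue)) {
            cells.remove(key(row, col)); // default cells are not stored at all
        } else {
            cells.put(key(row, col), value);
        }
    }

    V get(int row, int col) {
        return cells.getOrDefault(key(row, col), defaultValue);
    }

    // Number of cells that actually occupy an entry in the map.
    int storedCells() {
        return cells.size();
    }
}
```

Reading a cell that was never set returns the default, and writing the default back removes the entry, so the map only grows with the non-default cells.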

Well, I think a HashTable is the best choice for this... key-value pairs are efficient when many entries share the same value, i.e. one key for multiple values.
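
One way to read the HashTable suggestion above is a per-column index in which each distinct value is stored once and maps to the set of rows that hold it. The sketch below is only that interpretation, not something the answer spells out. The name ColumnIndex is hypothetical, and it assumes row indices fit in an int and that each row is written at most once.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical index for one column: each distinct value is stored once,
// together with a bit set marking the rows that contain it.
class ColumnIndex<V> {
    private final Map<V, BitSet> rowsByValue = new HashMap<>();

    // Assumes each row is written at most once; an earlier value is not cleared.
    void put(int row, V value) {
        rowsByValue.computeIfAbsent(value, v -> new BitSet()).set(row);
    }

    // Scans the distinct values, which is cheap only when there are few of them.
    // Returns null if the row has no value.
    V get(int row) {
        for (Map.Entry<V, BitSet> e : rowsByValue.entrySet()) {
            if (e.getValue().get(row)) {
                return e.getKey();
            }
        }
        return null;
    }

    int distinctValues() {
        return rowsByValue.size();
    }

    int rowCount(V value) {
        BitSet rows = rowsByValue.get(value);
        return rows == null ? 0 : rows.cardinality();
    }
}
```

This fits the columns described in the question, where the number of distinct values is small compared with the number of rows. A lookup by row costs a scan over the distinct values, so it would suit columns with few of them.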
