How to convert Parse ObjectId (String) to long?

Every object in Parse.com has its own ObjectId, a 10-character string that apparently matches this regex: [0-9a-zA-Z]{10}

Examples of ObjectId in Parse:

  • X12wEq4sFf
  • Weg243d21s
  • zwg34GdsWE

I would like to convert this String to a long, because it will save memory and improve searching (10 chars in UTF-8 can take up to 40 bytes, while one long takes 8 bytes).

If we count the possible combinations, we find:

  • String ObjectId: 62^10 = 839299365868340224 different values;
  • long: 2^64 = 18446744073709551616 different values.
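The counting argument above can be checked mechanically. The following is only a sketch: the class and method names (`ObjectIdCapacity`, `idCount`, `fitsInSignedLong`) are made up for illustration, and it assumes the ID alphabet really is the 62 characters matched by the regex. Because Java's `long` is signed, it compares against the 2^63 non-negative values rather than the full 2^64.

```java
import java.math.BigInteger;

final class ObjectIdCapacity {

    /** Number of distinct IDs of the given length over a 62-symbol alphabet. */
    static BigInteger idCount(int length) {
        return BigInteger.valueOf(62).pow(length);
    }

    /** True if every ID of that length can map to a distinct non-negative long. */
    static boolean fitsInSignedLong(int length) {
        // Non-negative longs cover 0 .. Long.MAX_VALUE, which is 2^63 values.
        BigInteger nonNegativeLongs = BigInteger.ONE.shiftLeft(63);
        return idCount(length).compareTo(nonNegativeLongs) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(idCount(10));          // 62^10
        System.out.println(fitsInSignedLong(10)); // 10-char IDs fit
        System.out.println(fitsInSignedLong(11)); // 11-char IDs do not
    }
}
```

So 10 characters is the longest ID of this alphabet that still fits in a single `long`.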

So we can convert these values without losing information. Is there a simple way to do it safely? Please consider any kind of character encoding (UTF-8, UTF-16, etc.).

EDIT: So far I can only think of hard ways to solve it. I am asking whether there is an easy way.

  1. Your character set is a subset of the commonly used Base64 encoding, so you could just use that. Java has the Base64 class, so there is no need to roll your own codec for this.
  2. Are you sure this is actually valuable? "because it will save memory and improve searching" seems like an untested assertion; saving a few bytes on the IDs may very well be offset by the added cost of encoding and decoding every time you want to use something.

EDIT: Also, why are you using UTF-8 strings for guaranteed-ASCII data? If you represent 10-char IDs as a byte[10], that's just 10 bytes instead of 40 (i.e. much closer to the 8 for a long). And you don't need to do any fancy conversions.
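As a rough illustration of the byte[10] idea, the sketch below converts an ID to and from its ASCII bytes. The class name `AsciiId` and its methods are hypothetical, and it assumes the ID contains only ASCII characters, as the regex in the question suggests.

```java
import java.nio.charset.StandardCharsets;

final class AsciiId {

    /** Encodes an all-ASCII ID as one byte per character. */
    static byte[] toBytes(String id) {
        return id.getBytes(StandardCharsets.US_ASCII);
    }

    /** Rebuilds the original ID from its ASCII bytes. */
    static String fromBytes(byte[] bytes) {
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] raw = toBytes("X12wEq4sFf");
        System.out.println(raw.length);     // one byte per character
        System.out.println(fromBytes(raw)); // round trip back to the ID
    }
}
```

Note that a Java array compares by identity, so a byte[] used as a lookup key would need `java.util.Arrays.equals` and `Arrays.hashCode` (or a small wrapper class) to compare by content.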

Here's a straightforward solution using 6 bits to store a single character.

public class Converter {

    // The 62 allowed characters; a character's index here is its 6-bit code (0..61).
    private static final String CHARS = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Maps a character to its 6-bit code, rejecting anything outside the alphabet.
    private static int convertChar(char c) {
        int ret = CHARS.indexOf( c );
        if (ret == -1)
            throw new IllegalArgumentException( "Invalid character encountered: "+c);
        return ret;
    }

    // Packs 10 characters into the low 60 bits of a long, first character in the highest bits.
    public static long convert(String s) {
        if (s.length() != 10)
            throw new IllegalArgumentException( "String length must be 10, was "+s.length() );
        long ret = 0;
        for (int i = 0; i < s.length(); i++) {
            ret = (ret << 6) + convertChar( s.charAt( i ));
        }
        return ret;
    }
}

I'll leave the conversion from long to String for you to implement; it's basically the same in reverse.
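For readers who want to see the reverse direction, here is one possible sketch, not the answerer's own code. The class `ObjectIdCodec` and the method names `encode` and `decode` are hypothetical; `encode` repeats the 6-bit packing from the answer so that the example is self-contained, and `decode` peels off 6 bits at a time from the low end, filling the output from the last character backwards.

```java
final class ObjectIdCodec {

    // Same 62-character alphabet as the answer; the index is the 6-bit code.
    private static final String CHARS =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final int LENGTH = 10;

    /** Packs a 10-character ID into the low 60 bits of a long. */
    static long encode(String s) {
        if (s.length() != LENGTH)
            throw new IllegalArgumentException("String length must be 10, was " + s.length());
        long ret = 0;
        for (int i = 0; i < LENGTH; i++) {
            int code = CHARS.indexOf(s.charAt(i));
            if (code == -1)
                throw new IllegalArgumentException("Invalid character: " + s.charAt(i));
            ret = (ret << 6) | code;
        }
        return ret;
    }

    /** Unpacks a long produced by encode back into the 10-character ID. */
    static String decode(long value) {
        // Only the low 60 bits may be set; this also rejects negative values.
        if ((value >>> 60) != 0)
            throw new IllegalArgumentException("Value out of range: " + value);
        char[] out = new char[LENGTH];
        for (int i = LENGTH - 1; i >= 0; i--) {
            int code = (int) (value & 0x3F); // lowest 6 bits hold the last character
            if (code >= CHARS.length())      // codes 62 and 63 are never produced by encode
                throw new IllegalArgumentException("Invalid 6-bit code: " + code);
            out[i] = CHARS.charAt(code);
            value >>>= 6;
        }
        return new String(out);
    }

    public static void main(String[] args) {
        long packed = encode("X12wEq4sFf");
        System.out.println(decode(packed)); // round trip back to the original ID
    }
}
```

Because each character gets a full 6 bits, two of the 64 possible codes per position go unused; `decode` rejects them rather than guessing a character.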

P.S.: If you really want to save space, don't use Long; it adds nothing compared to the primitive long except overhead.

P.S. 2: Also note that you aren't really saving much with this conversion: the ASCII characters can be stored in 10 bytes, while a long takes up 8. What you save here is mostly the overhead you'd get if you stored those 10 bytes in a byte array.
