
Converting non-numeric String to Integer?

How can I convert a non-numeric String to an Integer?

For instance, I have:

String unique = "FUBAR";

What's a good way to represent the String as an Integer with no collisions? For example, "FUBAR" should always be represented by the same number and must not collide with any other String. For instance, String a = "A"; should be represented as the Integer 1, and so on. Is there a method that does this (preferably for all Unicode strings, though in my case ASCII values would be sufficient)?

This is impossible. Think about it: an Integer has only 32 bits, so it can take at most 2^32 distinct values, while there are infinitely many possible strings. So, by the pigeonhole principle, at least two strings must have the same Integer value no matter what technique you use for the conversion. In fact, at least one Integer value must be shared by infinitely many strings.

If you're just looking for an efficient mapping, then I suggest that you just use the int returned by hashCode(). Note that this is a full 32-bit signed value (it can be negative), and distinct strings can share the same hash code.
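To make the collision point concrete, here is a minimal sketch showing two different strings with the same hashCode(). The class name HashCollisionDemo and the helper collide are made up for illustration; only String.hashCode() and String.equals() come from the JDK.

```java
final class HashCollisionDemo {
    /** Returns true when two different strings share the same hashCode(). */
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // String.hashCode() is s[0]*31^(n-1) + ... + s[n-1], so:
        // "Aa" -> 65 * 31 + 97 = 2112
        // "BB" -> 66 * 31 + 66 = 2112
        System.out.println("Aa".hashCode());
        System.out.println("BB".hashCode());
        System.out.println(collide("Aa", "BB"));
    }
}
```

So hashCode() is fine as a fast, stable mapping, but it cannot serve as a collision-free identifier.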

You can map Strings to unique IDs using a table. There is no way to do this generically.

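import java.util.HashMap;
import java.util.Map;

// Table of every string seen so far and the ID assigned to it.
final Map<String, Integer> map = new HashMap<>();

public int idFor(String s) {
    Integer id = map.get(s);
    // First time we see this string: give it the next free ID (0, 1, 2, ...).
    if (id == null)
       map.put(s, id = map.size());
    return id;
}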

Note: having unique IDs doesn't guarantee that there are no collisions in a hash collection.

http://vanillajava.blogspot.co.uk/2013/10/unique-hashcodes-is-not-enough-to-avoid.html
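As a self-contained variation on the table idea, the sketch below also keeps a list so that an ID can be turned back into its string. The class StringIdTable and the method nameFor are hypothetical names added for illustration. The IDs are only unique within one table instance and depend on insertion order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class StringIdTable {
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<String> names = new ArrayList<>();

    /** Returns the existing ID for s, or assigns the next free one. */
    int idFor(String s) {
        Integer id = ids.get(s);
        if (id == null) {
            id = names.size();   // IDs are handed out as 0, 1, 2, ...
            ids.put(s, id);
            names.add(s);
        }
        return id;
    }

    /** Reverse lookup: the string that was assigned this ID. */
    String nameFor(int id) {
        return names.get(id);
    }
}
```

Asking for the same string twice returns the same ID, and distinct strings never share one, at the cost of keeping every string in memory.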

If you know the character set used in your strings, then you can think of the string as a number with a base other than 10. For example, hexadecimal numbers contain letters from A to F.

Therefore, if you know that your strings only contain letters from an 8-bit character set, you can treat the string as a base-256 number. In pseudocode this would be:

number n = 0
for each letter in string
    n = 256 * n + (letter's position in character set)

If your character set contains 65535 characters, then just multiply 'n' by that number on each step. But beware, the 32 bits of an integer will easily overflow: at 8 bits per character, an int holds at most four characters. You probably need to use a type that can hold a larger number.
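The pseudocode above could be sketched in Java roughly as follows. This is an illustration, not a definitive implementation: the class name Base256 and the method encode are made up, a long is used instead of an int to buy a few more characters, and Math.multiplyExact / Math.addExact are used so that overflow raises an ArithmeticException instead of silently wrapping around.

```java
final class Base256 {
    /**
     * Treats s as a base-256 number, most significant character first.
     * Assumes every character fits into 8 bits (code 0..255).
     */
    static long encode(String s) {
        long n = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0xFF) {
                throw new IllegalArgumentException("not an 8-bit character: " + c);
            }
            // Throws ArithmeticException as soon as the value no longer fits.
            n = Math.addExact(Math.multiplyExact(n, 256L), (long) c);
        }
        return n;
    }

    public static void main(String[] args) {
        // 'F'=0x46, 'U'=0x55, 'B'=0x42, 'A'=0x41, 'R'=0x52
        System.out.println(Long.toHexString(encode("FUBAR"))); // prints 4655424152
    }
}
```

Even with a long, the ninth character overflows. Note also that characters at position 0 in front of the string do not change the value, so the mapping is only collision-free if the string cannot start with such a character.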

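import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;

private BigDecimal createBigDecimalFromString(String data)
{
    BigDecimal value = BigDecimal.ZERO;
    BigDecimal base = new BigDecimal(256);

    // StandardCharsets.UTF_8 avoids the checked UnsupportedEncodingException
    // that the original code caught and silently ignored.
    byte[] tmp = data.getBytes(StandardCharsets.UTF_8);
    for (int i = tmp.length - 1; i >= 0; i--)
    {
        // Java bytes are signed; mask with 0xFF so that bytes >= 0x80
        // count as 128..255 instead of negative digits.
        BigDecimal digit = new BigDecimal(tmp[i] & 0xFF);
        value = value.add(base.pow(i).multiply(digit));
    }
    return value;
}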

Regardless of the accepted answer, it is possible to represent any String as an integer by computing that String's Gödel number, which is a unique product of prime powers for every possible String. That said, it is quite impractical and slow; for most Strings you would need a BigInteger rather than a normal Integer, and to decode a Gödel number back into its corresponding String you need a defined Charset.
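A minimal sketch of such a Gödel numbering, assuming the common scheme in which the i-th character becomes the exponent of the i-th prime. The names GodelNumber, encode, nextPrime and isPrime are hypothetical, and the exponent is taken as the char value plus one (an assumption made here so that a character with code 0 still contributes a factor).

```java
import java.math.BigInteger;

final class GodelNumber {
    /** Simple trial-division primality check; fine for the small primes needed here. */
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    /** Returns the smallest prime greater than p. */
    static int nextPrime(int p) {
        int candidate = p + 1;
        while (!isPrime(candidate)) candidate++;
        return candidate;
    }

    /** Computes 2^(c0+1) * 3^(c1+1) * 5^(c2+1) * ... for the chars c0, c1, c2, ... of s. */
    static BigInteger encode(String s) {
        BigInteger result = BigInteger.ONE;
        int prime = 1;
        for (int i = 0; i < s.length(); i++) {
            prime = nextPrime(prime);          // 2, 3, 5, 7, ...
            int exponent = s.charAt(i) + 1;    // +1 so the exponent is never 0
            result = result.multiply(BigInteger.valueOf(prime).pow(exponent));
        }
        return result;
    }
}
```

Uniqueness follows from unique prime factorization, but the numbers grow very quickly: the single character "A" already maps to 2^66, which does not fit into a long.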

Maybe a little bit late, but I'm going to give my 10 cents to simplify it (internally it is similar to the BigDecimal approach suggested by @Romain Hippeau):

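import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
public static BigInteger getNumberId(final String value) {
    // Signum 1 keeps the result non-negative and also accepts an empty string.
    return new BigInteger(1, value.getBytes(StandardCharsets.UTF_8));
}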
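Because the BigInteger is built directly from the UTF-8 bytes, the number can be turned back into the string. The following is a sketch under stated assumptions: StringNumberCodec, encode and decode are hypothetical names, and the string is assumed not to start with a NUL character (leading zero bytes do not change the numeric value, so they would be lost).

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

final class StringNumberCodec {
    /** Interprets the UTF-8 bytes of value as one non-negative big-endian number. */
    static BigInteger encode(String value) {
        return new BigInteger(1, value.getBytes(StandardCharsets.UTF_8));
    }

    /** Inverse of encode for strings that do not start with a NUL character. */
    static String decode(BigInteger number) {
        if (number.signum() == 0) {
            return "";
        }
        byte[] bytes = number.toByteArray();
        // toByteArray() prepends a zero sign byte when the top bit is set; skip it.
        int offset = (bytes.length > 1 && bytes[0] == 0) ? 1 : 0;
        return new String(bytes, offset, bytes.length - offset, StandardCharsets.UTF_8);
    }
}
```

For example, decode(encode("FUBAR")) gives back "FUBAR", and the intermediate number is 0x4655424152.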
