
Is it legitimate to change a String to Int so that I can sort it better?

I have many Strings, for example ("a32ghS:SAD"), and I need to sort them. Is it okay to get an integer value like this:

String s = "a32ghS:SAD";
int l = 0;
for (int i = 0; i < s.length(); i++) {
    l += (int) s.charAt(i);
}

Is it okay to sort the Strings based on the integer l? Or should I sort them based on the Strings themselves?

Much depends on what you want to do. :)

However, if you do the conversion inside the comparison while sorting, you'll end up performing O(N log N) string-to-int conversions. If you instead convert your strings once before sorting, you'll drop to only O(N) conversions.
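To make the difference concrete, here is a minimal sketch that counts conversions both ways. The class `ConversionCount`, the helper `toKey`, and the counter are hypothetical names made up for illustration; `toKey` reuses the character sum from the question purely as a stand-in conversion, so the resulting order is not alphabetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ConversionCount {
    static int conversions = 0; // counts calls to toKey

    // Hypothetical string-to-int conversion (the character sum from the question).
    static int toKey(String s) {
        conversions++;
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i);
        }
        return sum;
    }

    // Converts inside the comparator: two conversions for every comparison.
    static int countWhenConvertingInComparator(List<String> input) {
        conversions = 0;
        List<String> copy = new ArrayList<>(input);
        copy.sort((a, b) -> Integer.compare(toKey(a), toKey(b)));
        return conversions;
    }

    // Converts each string exactly once, then sorts indices by the stored keys.
    static int countWhenPrecomputing(List<String> input) {
        conversions = 0;
        int[] keys = new int[input.size()];
        Integer[] order = new Integer[input.size()];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = toKey(input.get(i));
            order[i] = i;
        }
        Arrays.sort(order, (x, y) -> Integer.compare(keys[x], keys[y]));
        return conversions;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("pear", "fig", "apple", "kiwi", "plum");
        System.out.println("in comparator: " + countWhenConvertingInComparator(data));
        System.out.println("precomputed:   " + countWhenPrecomputing(data));
    }
}
```

The precomputed variant always reports exactly one conversion per string; the exact count for the comparator variant depends on the sort implementation and the input order.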

Simply adding up the character values of each character will sort the strings incorrectly (assuming you want alphabetical order). Consider the string "aZZZZ": with your code sample it will come after "b". Your method sorts the strings by the sum of the character codes of the characters they contain, which is not particularly useful.
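A small sketch of that counterexample, assuming the same summing loop as in the question (the class name `CharSumDemo` and the helper `sortBySum` are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class CharSumDemo {
    // Same computation as in the question: add up the char values.
    static int charSum(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i);
        }
        return sum;
    }

    // Returns a new list ordered by character sum only.
    static List<String> sortBySum(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(Comparator.comparingInt(CharSumDemo::charSum));
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(charSum("b"));     // 98
        System.out.println(charSum("aZZZZ")); // 97 + 4 * 90 = 457
        System.out.println(sortBySum(Arrays.asList("aZZZZ", "b"))); // [b, aZZZZ]
    }
}
```

Alphabetically "aZZZZ" belongs before "b", but the sum puts it after.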

Assuming you want to sort alphabetically, you should use the Java library method Collections.sort, as the code to do this is already written.

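import java.util.ArrayList;
import java.util.Collections;

ArrayList<String> list = new ArrayList<String>();

list.add("cc");
list.add("bb");
list.add("dd");
list.add("aa");

// Sorts in place using String's natural (lexicographic) order: [aa, bb, cc, dd]
Collections.sort(list);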

The typical alphabetical sort works by comparing the character codes in the first position (in Java, the Unicode values of the chars, which match ASCII for plain ASCII text) and ordering by those. If the characters are the same, the next character is considered, and so on.
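The comparison described above can be sketched by hand as follows. This is only an illustration of the idea (the class `LexCompare` is a made-up name); in practice `String.compareTo` already does this for you.

```java
final class LexCompare {
    // The first position where the strings differ decides the order;
    // if one string is a prefix of the other, the shorter one comes first.
    static int compare(String a, String b) {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            char x = a.charAt(i);
            char y = b.charAt(i);
            if (x != y) {
                return x - y;
            }
        }
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        System.out.println(compare("aZZZZ", "b") < 0); // true: 'a' < 'b' decides immediately
        System.out.println(compare("aa", "a") > 0);    // true: "a" is a prefix of "aa"
    }
}
```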

You won't be able to beat this sort of performance unless you are sorting in some particular way, or you can exploit something you know about your strings.

That would make "a32ghS:SAD" and "S32gha:SAD" have the same integer representation. Plus, you would have trouble converting the integers back to strings (you'd have to use some map structure).
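A quick sketch of both problems, assuming the summing loop from the question. The class `SumCollisionDemo` and the helper `groupBySum` are hypothetical; the map is one possible way to get from an integer back to the strings that produced it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SumCollisionDemo {
    // Same computation as in the question: add up the char values.
    static int charSum(String s) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i);
        }
        return sum;
    }

    // Maps each sum to every string that produced it, since the sum alone
    // cannot be turned back into a string.
    static Map<Integer, List<String>> groupBySum(List<String> input) {
        Map<Integer, List<String>> groups = new HashMap<>();
        for (String s : input) {
            groups.computeIfAbsent(charSum(s), k -> new ArrayList<>()).add(s);
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(charSum("a32ghS:SAD")); // 762
        System.out.println(charSum("S32gha:SAD")); // 762, same characters in a different order
        System.out.println(groupBySum(Arrays.asList("a32ghS:SAD", "S32gha:SAD")));
    }
}
```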

So, the answer is to just sort the strings. It's not really a slow operation (it depends on the number of items, of course).

No, because the position in the string matters (see the answers above for that). But if you know the maximum length of your strings, and you do a bitwise shift on the value after adding each character, it might be OK.
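One way the shift idea could look, as a sketch under fairly strong assumptions: every string is at most 8 characters long, contains only ASCII characters (codes 1 to 127), and a long is used instead of an int to make room. The class `PackedKey` and the constant `MAX_LEN` are made up for illustration. Note that the example string from the question has 10 characters and would not fit.

```java
final class PackedKey {
    static final int MAX_LEN = 8; // assumption: no string is longer than this

    // Packs the string into a long, 8 bits per character, first character in
    // the highest byte. Short strings are padded with 0 on the right.
    static long pack(String s) {
        if (s.length() > MAX_LEN) {
            throw new IllegalArgumentException("too long: " + s);
        }
        long key = 0;
        for (int i = 0; i < MAX_LEN; i++) {
            char c = i < s.length() ? s.charAt(i) : 0;
            if (c > 127) {
                throw new IllegalArgumentException("non-ASCII char in: " + s);
            }
            key = (key << 8) | c;
        }
        return key;
    }

    public static void main(String[] args) {
        System.out.println(pack("aZZZZ") < pack("b"));       // true, unlike the plain sum
        System.out.println(pack("a") < pack("ab"));          // true, a prefix sorts first
        System.out.println(pack("a32ghS") != pack("S32gha")); // true, position now matters
    }
}
```

Because the first character lands in the most significant byte, comparing the packed values gives the same order as comparing the strings character by character, within the stated limits.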

Keep in mind that String.compareTo uses the Unicode values of each character in much the same way, but the compareTo method is case-sensitive by default.
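A short sketch of what case sensitivity means for the result, using the standard `String.CASE_INSENSITIVE_ORDER` comparator as one alternative (the class `CaseOrderDemo` and its method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class CaseOrderDemo {
    // Natural order via String.compareTo: uppercase letters sort before lowercase.
    static List<String> naturalOrder(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    // Same data, sorted while ignoring case.
    static List<String> ignoringCase(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("b", "a", "C");
        System.out.println(naturalOrder(data)); // [C, a, b]
        System.out.println(ignoringCase(data)); // [a, b, C]
    }
}
```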

In the Cassandra database, they do something like that by default. However, to compute the integer, they compute a hash using murmur3. A hash is similar to your simple sum, but you are not likely to find two strings with the same hash (collisions exist, they're just rare).

In that case, it is useful because you compute the hash once and search possibly millions of rows. It makes it really fast because the hash allows the search to be sharded (i.e. if you have 201 computers and use groups of 3 computers to save your data [for replication], you get 67 groups, so a database searching 10,000,000 rows means searching about 149,253 rows on one of these little clusters).

Note that as a result the strings are not sorted alphabetically.
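To illustrate that last point, here is a sketch that sorts by a hash instead of by the strings. It uses Java's built-in `String.hashCode` purely as a stand-in; it is not murmur3 and not what Cassandra does internally. The class `HashOrderDemo` is a made-up name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class HashOrderDemo {
    // Orders strings by their hash code only (stand-in for a real hash such as murmur3).
    static List<String> sortByHash(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(Comparator.comparingInt(String::hashCode));
        return copy;
    }

    public static void main(String[] args) {
        // "aa".hashCode() is 31 * 97 + 97 = 3104, "b".hashCode() is 98, "c".hashCode() is 99
        System.out.println(sortByHash(Arrays.asList("aa", "b", "c"))); // [b, c, aa]
    }
}
```

Alphabetically "aa" comes first, but by hash it ends up last.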

Now, to sort strings in memory, you probably want to just use a sort() with the strings themselves as the key. Between the time to compute the hash, storing it, and the extra memory it uses, you're not likely to save anything. A standard sort is a comparison sort that needs on the order of N log2 N comparisons, so that's roughly 20 comparisons per string for 1,000,000 strings. It will be fast.

In Java, if you need to attach data to the string, use a Map (a SortedMap such as TreeMap keeps the keys in order). If you do not need any data, use a SortedSet.
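A minimal sketch of both options, using the standard `TreeSet` and `TreeMap` implementations. The class `SortedCollectionsDemo`, its method names, and the attached Integer values are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

final class SortedCollectionsDemo {
    // No attached data: a SortedSet keeps the strings in natural order.
    static SortedSet<String> asSortedSet(List<String> input) {
        return new TreeSet<>(input);
    }

    // Attached data: a SortedMap keeps the keys in natural order.
    // Here the attached value is just the string's length, as a placeholder.
    static SortedMap<String, Integer> withLengths(List<String> input) {
        SortedMap<String, Integer> map = new TreeMap<>();
        for (String s : input) {
            map.put(s, s.length());
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("cc", "bb", "dd", "aa");
        System.out.println(asSortedSet(data)); // [aa, bb, cc, dd]
        System.out.println(withLengths(data)); // {aa=2, bb=2, cc=2, dd=2}
    }
}
```

Keep in mind that a set drops duplicate strings, so a sorted list is the better fit if duplicates matter.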


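Notice: the technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.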
