So what I know is that a HashSet has no real sorting capabilities like a SortedSet, however I stumbled upon this :
When I run the following code :
public static void main(String[] args) {
Set<String> collection = new HashSet<String>(2000);
String[] data = {"a", "c", "g", "f", "b", "f", "b", "d","q","r","d","m"};
for(String input: data)
{
collection.add(input);
}
System.out.println("Output: " + collection);
}
I get the following output : Output: [a, b, c, d, f, g, m, q, r]
Which is alphabetically sorted. Why is that? Since a HashSet is not a sorted set.
So I tried with a string of characters instead of a single character :
public static void main(String[] args) {
Set<String> collection = new HashSet<String>(2000);
String[] data = {"atjre", "crj", "gertj", "fertj", "berj"};
for(String input: data)
{
collection.add(input);
}
System.out.println("Output: " + collection);
}
And i get the following output : Output: [crj, atjre, fertj, gertj, berj]
Now they are not sorted anymore, any explanations for this? Or is this just a random coincidence?
HashSet
implements Set
interface. It means that there is no guarantee of order of elements.
This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. Source
Over the time after you adding, deleting few times you can see the difference.
However, "no guarantee of ordering" does not imply "guaranteed random ordering". Exact answer of your question is,
The
hashcode
-method of theString
class also comes into play here, for single characterString
s thehashcode
will just be theint
value of the onechar
in theString
. And sincechar
'sint
values are ordered alphabetically, so will the computedhashes
of singlechar
String
s.
As per the Java docs: https://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html
It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.
What I think you are experiencing here is a hash-function distribution anomaly. The hash-function is used internally to give your strings an integer index. For 1-long strings there isn't much complexity. As you make your strings longer, your hash function has more to work with.
This stems back to the whole idea of a hash function: take a set of possible values, and map them as evenly as possible to a set of smaller values. It just so happens that the hash function has those strings mapped as it does. You would probably see the same thing with consecutive numbers. And you start to see them un-ordered once more data is introduced.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.