Lowercase all HashMap keys

I've run into a scenario where I want to lowercase all the keys of a HashMap (don't ask why, I just have to do this). The HashMap has several million entries.

At first, I thought I'd just create a new Map, iterate over the entries of the map that is to be lowercased, and add the respective values. This task should run only about once per day, so I thought I could bear the cost.

Map<String, Long> lowerCaseMap = new HashMap<>(myMap.size());
for (Map.Entry<String, Long> entry : myMap.entrySet()) {
   lowerCaseMap.put(entry.getKey().toLowerCase(), entry.getValue());
}

This, however, caused OutOfMemoryErrors when my server happened to be overloaded at the moment I was copying the Map.

Now my question is, how can I accomplish this task with the smallest memory footprint?

Would it help to remove each key from the original map once its lowercased version has been added to the new Map?

Could I use Java 8 streams to make this faster? (e.g. something like this)

Map<String, Long> lowerCaseMap = myMap.entrySet().parallelStream()
    .collect(Collectors.toMap(entry -> entry.getKey().toLowerCase(), Map.Entry::getValue));

Update: It turns out the map is a Collections.unmodifiableMap, so I don't have the option of removing each key once it has been lowercased and added to the new Map.

Instead of using a HashMap, you could try a TreeMap with case-insensitive ordering. This avoids the need to create a lower-case version of each key:

Map<String, Long> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
map.putAll(myMap);

Once you've constructed this map, put() and get() will behave case-insensitively, so you can save and fetch values using all-lowercase keys. Iterating over keys will return them in their original, possibly upper-case forms.
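To make that behaviour concrete, here is a minimal sketch. The class name, the helper method and the sample keys are made up for illustration; only TreeMap and String.CASE_INSENSITIVE_ORDER come from the answer above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class CaseInsensitiveLookup {
    // Copies the source into a TreeMap that compares keys case-insensitively.
    // (Hypothetical helper name, not part of any library.)
    static Map<String, Long> caseInsensitiveCopy(Map<String, Long> source) {
        Map<String, Long> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        map.putAll(source);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Long> myMap = new HashMap<>();
        myMap.put("Alice", 1L);
        myMap.put("BOB", 2L);

        Map<String, Long> map = caseInsensitiveCopy(myMap);
        System.out.println(map.get("alice"));       // 1
        System.out.println(map.containsKey("bob")); // true
        System.out.println(map.keySet());           // [Alice, BOB] (original casing is kept)
    }
}
```

Two things to keep in mind: keys that differ only in case collapse into a single entry, and TreeMap lookups are O(log n) rather than the expected O(1) of a HashMap.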

You cannot remove an entry through the map itself while iterating over it. You will get a ConcurrentModificationException if you try; removal during iteration is only safe through the iterator's own remove() method.
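A small sketch of the difference, assuming a plain mutable HashMap (the class and method names here are invented for the example):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class RemoveWhileIterating {
    // Returns true if removing through the map inside a for-each loop fails fast.
    // With a HashMap holding more than one entry, the next step of the
    // iteration notices the modification and throws.
    static boolean removeViaMapThrows(Map<String, Long> map) {
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes every entry through the iterator, which is the supported way.
    static void removeViaIterator(Map<String, Long> map) {
        Iterator<Map.Entry<String, Long>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();
        }
    }

    public static void main(String[] args) {
        Map<String, Long> first = new HashMap<>();
        first.put("a", 1L);
        first.put("b", 2L);
        first.put("c", 3L);
        System.out.println(removeViaMapThrows(first)); // true

        Map<String, Long> second = new HashMap<>();
        second.put("a", 1L);
        second.put("b", 2L);
        removeViaIterator(second);
        System.out.println(second.isEmpty()); // true
    }
}
```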

As the issue is an OutOfMemoryError, not a performance problem, using a parallel stream will not help either.

Although some Stream API operations are evaluated lazily, this still leaves two full maps in memory at some point, so you will still have the issue.

To work around it, I only see two ways:

  • Give more memory to your process (by increasing -Xmx on the Java command line). Memory is cheap these days ;)
  • Split the map and work in chunks: for example, divide the size of the map by ten, process one chunk at a time, and delete the processed entries before moving on to the next chunk. That way, instead of holding two full copies of the map in memory, you hold roughly 1.1 times the map.

For the split algorithm, you can try something like this using the Stream API:

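Map<String, String> toMap = new HashMap<>();
// Roughly a tenth of the map per pass, but at least 1 so small maps still make progress.
int chunk = Math.max(1, fromMap.size() / 10);
// Loop until fromMap is empty so the remainder of the division is not left behind.
while (!fromMap.isEmpty()) {
    // Copy the next chunk into a list first, so fromMap can be modified safely afterwards.
    List<Map.Entry<String, String>> subEntries = fromMap.entrySet().stream().limit(chunk)
        .collect(Collectors.toList());

    for (Map.Entry<String, String> entry : subEntries) {
        toMap.put(entry.getKey().toLowerCase(), entry.getValue());
        fromMap.remove(entry.getKey());
    }
}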
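If the source map is mutable (which, per the question's update, is not the case for a Collections.unmodifiableMap), the same "move and delete" idea can also be sketched without chunks by draining the map through its iterator. This is an alternative to the chunked loop above, not the same algorithm, and the class and method names are invented for the example:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class DrainingLowercase {
    // Moves every entry into a new map under its lowercased key and removes it
    // from the source as it goes, so each entry lives in only one map at a time.
    static Map<String, Long> lowercaseKeysDraining(Map<String, Long> source) {
        Map<String, Long> target = new HashMap<>();
        Iterator<Map.Entry<String, Long>> it = source.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            // Read key and value before removing the entry from the source.
            target.put(entry.getKey().toLowerCase(), entry.getValue());
            it.remove(); // safe: removal goes through the iterator
        }
        return target;
    }

    public static void main(String[] args) {
        Map<String, Long> myMap = new HashMap<>();
        myMap.put("Alice", 1L);
        myMap.put("BOB", 2L);

        Map<String, Long> lower = lowercaseKeysDraining(myMap);
        System.out.println(myMap.isEmpty());  // true
        System.out.println(lower.get("bob")); // 2
    }
}
```

Note that a HashMap does not shrink its internal bucket array when entries are removed, so what is reclaimed along the way are the old key strings and entry objects, not the table itself. If two keys differ only in case, the later one overwrites the earlier one.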

The concerns in the answers above are correct, and you might need to reconsider the data structure you are using.

In my case, I had a simple map whose keys I needed to change to lower case.

Take a look at my snippet. It's a trivial solution and performs badly.

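private void convertAllFilterKeysToLowerCase() {
    // First pass: collect the keys that are not already lower case.
    HashSet<Object> keysToRemove = new HashSet<>();
    getFilters().keySet().forEach(o -> {
        if(!o.equals(((String) o).toLowerCase()))
            keysToRemove.add(o);
    });
    // Second pass: re-insert each collected entry under its lowercased key.
    keysToRemove.forEach(o -> getFilters().put(((String) o).toLowerCase(), getFilters().remove(o)));
}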

I'm not sure about the memory footprint, but if you are using Kotlin, you can try the following. Note that mapKeys returns a new map, so both maps exist in memory at once.

val lowerCaseMap = myMap.mapKeys { it.key.toLowerCase() }

https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.collections/map-keys.html

 