Fastest way to get all values from a Map where the key starts with a certain expression

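Suppose you have a Map<String, Object> myMap.

Given the expression "some.string.*", I have to retrieve all values from myMap whose keys start with that expression.

I am trying to avoid a for loop because myMap will be given a set of expressions, not just one, and running a for loop for each expression becomes cumbersome.

What is the fastest way to do this?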

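If you use a NavigableMap (e.g. a TreeMap), you can take advantage of the underlying tree data structure and do something like the following. Finding the start of the matching range costs O(lg(N)); iterating over the matches then costs time proportional to their number: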

public SortedMap<String, Object> getByPrefix( 
        NavigableMap<String, Object> myMap, 
        String prefix ) {
    return myMap.subMap( prefix, prefix + Character.MAX_VALUE );
}

A more extended example:

import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

public class Test {

    public static void main( String[] args ) {
        TreeMap<String, Object> myMap = new TreeMap<String, Object>();
        myMap.put( "111-hello", null );
        myMap.put( "111-world", null );
        myMap.put( "111-test", null );
        myMap.put( "111-java", null );

        myMap.put( "123-one", null );
        myMap.put( "123-two", null );
        myMap.put( "123--three", null );
        myMap.put( "123--four", null );

        myMap.put( "125-hello", null );
        myMap.put( "125--world", null );

        System.out.println( "111 \t" + getByPrefix( myMap, "111" ) );
        System.out.println( "123 \t" + getByPrefix( myMap, "123" ) );
        System.out.println( "123-- \t" + getByPrefix( myMap, "123--" ) );
        System.out.println( "12 \t" + getByPrefix( myMap, "12" ) );
    }

    private static SortedMap<String, Object> getByPrefix(
            NavigableMap<String, Object> myMap,
            String prefix ) {
        return myMap.subMap( prefix, prefix + Character.MAX_VALUE );
    }
}

The output is:

111     {111-hello=null, 111-java=null, 111-test=null, 111-world=null}
123     {123--four=null, 123--three=null, 123-one=null, 123-two=null}
123--   {123--four=null, 123--three=null}
12      {123--four=null, 123--three=null, 123-one=null, 123-two=null, 125--world=null, 125-hello=null}
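The question mentions a set of expressions rather than a single one. The sketch below shows one way to apply the same subMap lookup to each of them. The names PrefixLookup, toPrefix and valuesFor are hypothetical, and it assumes that an expression is a plain prefix with an optional trailing "*". It uses the same upper bound as getByPrefix above, so it shares that method's behaviour for keys containing Character.MAX_VALUE.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class PrefixLookup {

    // Hypothetical helper: turns an expression such as "some.string.*"
    // into the plain prefix "some.string." by dropping one trailing '*'.
    public static String toPrefix(String expression) {
        return expression.endsWith("*")
                ? expression.substring(0, expression.length() - 1)
                : expression;
    }

    // Collects the values of all keys that start with any of the expressions.
    // Overlapping expressions contribute their matches more than once.
    public static <V> List<V> valuesFor(NavigableMap<String, V> map,
                                        List<String> expressions) {
        List<V> result = new ArrayList<>();
        for (String expression : expressions) {
            String prefix = toPrefix(expression);
            // Same range as getByPrefix: [prefix, prefix + MAX_VALUE)
            result.addAll(map.subMap(prefix, prefix + Character.MAX_VALUE).values());
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<String, Integer> myMap = new TreeMap<>();
        myMap.put("some.byte.a", 1);
        myMap.put("some.string.a", 2);
        myMap.put("some.string.b", 3);
        myMap.put("other.string.a", 4);

        // Prints [2, 3, 4]
        System.out.println(valuesFor(myMap, Arrays.asList("some.string.*", "other.*")));
    }
}
```

There is still one loop over the expressions, but each iteration only touches the matching range of the tree instead of every key.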

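I recently wrote a MapFilter for just such a need. You can also filter an already filtered map, which is very useful.

If your expressions share a common root, such as "some.byte" and "some.string", then filtering by the common root first ("some." in this case) will save you a great deal of time. See main for some simple examples.

Note that making changes to the filtered map changes the underlying map.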

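import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class MapFilter<T> implements Map<String, T> {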

    // The enclosed map -- could also be a MapFilter.
    final private Map<String, T> map;

    // Use a TreeMap for predictable iteration order.
    // Store Map.Entry to reflect changes down into the underlying map.
    // The Key is the shortened string. The entry.key is the full string.
    final private Map<String, Map.Entry<String, T>> entries = new TreeMap<>();
    // The prefix they are looking for in this map.
    final private String prefix;

    public MapFilter(Map<String, T> map, String prefix) {
        // Store my backing map.
        this.map = map;
        // Record my prefix.
        this.prefix = prefix;
        // Build my entries.
        rebuildEntries();
    }

    public MapFilter(Map<String, T> map) {
        this(map, "");
    }

    private synchronized void rebuildEntries() {
        // Start empty.
        entries.clear();
        // Build my entry set.
        for (Map.Entry<String, T> e : map.entrySet()) {
            String key = e.getKey();
            // Retain each one that starts with the specified prefix.
            if (key.startsWith(prefix)) {
                // Key it on the remainder.
                String k = key.substring(prefix.length());
                // Entries k always contains the LAST occurrence if there are multiples.
                entries.put(k, e);
            }
        }

    }

    @Override
    public String toString() {
        return "MapFilter(" + prefix + ") of " + map + " containing " + entrySet();
    }

    // Constructor from a properties file.
    public MapFilter(Properties p, String prefix) {
        // Properties extends Hashtable<Object,Object> so it implements Map.
        // I need Map<String,T> so I wrap it in a HashMap for simplicity.
        // Java-8 breaks if we use diamond inference.
        this(new HashMap<>((Map) p), prefix);
    }

    // Helper to fast filter the map.
    public MapFilter<T> filter(String prefix) {
        // Wrap me in a new filter.
        return new MapFilter<>(this, prefix);
    }

    // Count my entries.
    @Override
    public int size() {
        return entries.size();
    }

    // Are we empty.
    @Override
    public boolean isEmpty() {
        return entries.isEmpty();
    }

    // Is this key in me?
    @Override
    public boolean containsKey(Object key) {
        return entries.containsKey(key);
    }

    // Is this value in me.
    @Override
    public boolean containsValue(Object value) {
        // Walk the values.
        for (Map.Entry<String, T> e : entries.values()) {
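            // Null-safe comparison: both the argument and stored values may be null.
            if (value == null ? e.getValue() == null : value.equals(e.getValue())) {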
                // Its there!
                return true;
            }
        }
        return false;
    }

    // Get the referenced value - if present.
    @Override
    public T get(Object key) {
        return get(key, null);
    }

    // Get the referenced value - if present.
    public T get(Object key, T dflt) {
        Map.Entry<String, T> e = entries.get((String) key);
        return e != null ? e.getValue() : dflt;
    }

    // Add to the underlying map.
    @Override
    public T put(String key, T value) {
        T old = null;
        // Do I have an entry for it already?
        Map.Entry<String, T> entry = entries.get(key);
        // Was it already there?
        if (entry != null) {
            // Yes. Just update it.
            old = entry.setValue(value);
        } else {
            // Add it to the map.
            map.put(prefix + key, value);
            // Rebuild.
            rebuildEntries();
        }
        return old;
    }

    // Get rid of that one.
    @Override
    public T remove(Object key) {
        // Do I have an entry for it?
        Map.Entry<String, T> entry = entries.get((String) key);
        if (entry != null) {
            entries.remove(key);
            // Change the underlying map.
            return map.remove(prefix + key);
        }
        return null;
    }

    // Add all of them.
    @Override
    public void putAll(Map<? extends String, ? extends T> m) {
        for (Map.Entry<? extends String, ? extends T> e : m.entrySet()) {
            put(e.getKey(), e.getValue());
        }
    }

    // Clear everything out.
    @Override
    public void clear() {
        // Remove only my filtered entries from the underlying map.
        // Entries that do not match the prefix are left in place.
        for (String key : entries.keySet()) {
            map.remove(prefix + key);
        }
        entries.clear();
    }

    @Override
    public Set<String> keySet() {
        return entries.keySet();
    }

    @Override
    public Collection<T> values() {
        // Roll them all out into a new ArrayList.
        List<T> values = new ArrayList<>();
        for (Map.Entry<String, T> v : entries.values()) {
            values.add(v.getValue());
        }
        return values;
    }

    @Override
    public Set<Map.Entry<String, T>> entrySet() {
        // Roll them all out into a new TreeSet.
        Set<Map.Entry<String, T>> entrySet = new TreeSet<>();
        for (Map.Entry<String, Map.Entry<String, T>> v : entries.entrySet()) {
            entrySet.add(new Entry<>(v));
        }
        return entrySet;
    }

    /**
     * An entry.
     *
     * @param <T> The type of the value.
     */
    private static class Entry<T> implements Map.Entry<String, T>, Comparable<Entry<T>> {

        // Note that entry in the entry is an entry in the underlying map.

        private final Map.Entry<String, Map.Entry<String, T>> entry;

        Entry(Map.Entry<String, Map.Entry<String, T>> entry) {
            this.entry = entry;
        }

        @Override
        public String getKey() {
            return entry.getKey();
        }

        @Override
        public T getValue() {
            // Remember that the value is the entry in the underlying map.
            return entry.getValue().getValue();
        }

        @Override
        public T setValue(T newValue) {
            // Remember that the value is the entry in the underlying map.
            return entry.getValue().setValue(newValue);
        }

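        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Entry)) {
                return false;
            }
            Entry<?> e = (Entry<?>) o;
            // Null-safe comparison: the underlying map may hold null values.
            return getKey().equals(e.getKey())
                    && java.util.Objects.equals(getValue(), e.getValue());
        }

        @Override
        public int hashCode() {
            // Null-safe: a null value contributes 0 to the hash.
            return getKey().hashCode() ^ java.util.Objects.hashCode(getValue());
        }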

        @Override
        public String toString() {
            return getKey() + "=" + getValue();
        }

        @Override
        public int compareTo(Entry<T> o) {
            return getKey().compareTo(o.getKey());
        }

    }

    // Simple tests.
    public static void main(String[] args) {
        String[] samples = {
                "Some.For.Me",
                "Some.For.You",
                "Some.More",
                "Yet.More"};
        Map map = new HashMap();
        for (String s : samples) {
            map.put(s, s);
        }
        Map all = new MapFilter(map);
        Map some = new MapFilter(map, "Some.");
        Map someFor = new MapFilter(some, "For.");
        System.out.println("All: " + all);
        System.out.println("Some: " + some);
        System.out.println("Some.For: " + someFor);

        Properties props = new Properties();
        props.setProperty("namespace.prop1", "value1");
        props.setProperty("namespace.prop2", "value2");
        props.setProperty("namespace.iDontKnowThisNameAtCompileTime", "anothervalue");
        props.setProperty("someStuff.morestuff", "stuff");
        Map<String, String> filtered = new MapFilter(props, "namespace.");
        System.out.println("namespace props " + filtered);
    }

}
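For comparison, the idea of filtering by a common root first and then narrowing further can also be sketched with the live views that TreeMap already provides. This is not MapFilter itself: the class NestedPrefixViews and its view method are hypothetical, keys keep their full prefix instead of being shortened, and the helper assumes a non-empty prefix whose last character is below Character.MAX_VALUE.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class NestedPrefixViews {

    // Returns a live view of the entries whose key starts with prefix.
    // Assumption: prefix is non-empty and its last char is below Character.MAX_VALUE.
    public static <V> NavigableMap<String, V> view(NavigableMap<String, V> map,
                                                   String prefix) {
        int last = prefix.length() - 1;
        // Exclusive upper bound: the prefix with its last character bumped by one.
        String upper = prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
        return map.subMap(prefix, true, upper, false);
    }

    public static void main(String[] args) {
        NavigableMap<String, String> map = new TreeMap<>();
        map.put("Some.For.Me", "me");
        map.put("Some.For.You", "you");
        map.put("Some.More", "more");
        map.put("Yet.More", "yet");

        // Filter by the common root first, then narrow the result further.
        NavigableMap<String, String> some = view(map, "Some.");
        NavigableMap<String, String> someFor = view(some, "Some.For.");

        // Writing to the innermost view changes the underlying map as well.
        someFor.put("Some.For.Us", "us");

        System.out.println("Some: " + some.keySet());
        System.out.println("Some.For: " + someFor.keySet());
        System.out.println("In underlying map: " + map.containsKey("Some.For.Us"));
    }
}
```

As with MapFilter, changes made through a view are visible in the underlying map. Unlike MapFilter, this requires a sorted map to begin with.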

Remove all keys that do not start with the desired prefix (note that this modifies yourMap in place):

yourMap.keySet().removeIf(key -> !key.startsWith(keyPrefix));
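Because removeIf on the key set deletes entries from the map itself, a variant that leaves the original untouched may be useful when several prefixes have to be queried against the same map. The following is a minimal sketch under that assumption; PrefixFilterCopy and copyWithPrefix are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

public class PrefixFilterCopy {

    // Returns a new map with only the entries whose key starts with keyPrefix.
    // The source map is copied first, so it is not modified.
    public static <V> Map<String, V> copyWithPrefix(Map<String, V> source,
                                                    String keyPrefix) {
        Map<String, V> copy = new HashMap<>(source);
        copy.keySet().removeIf(key -> !key.startsWith(keyPrefix));
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Integer> yourMap = new HashMap<>();
        yourMap.put("some.string.a", 1);
        yourMap.put("some.byte.a", 2);

        Map<String, Integer> filtered = copyWithPrefix(yourMap, "some.string.");

        System.out.println(filtered);        // only the "some.string." entry
        System.out.println(yourMap.size());  // still 2
    }
}
```

This still visits every key once per prefix, so it trades speed for simplicity.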

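The accepted answer works in 99% of all cases, but the devil is in the details.

Specifically, the accepted answer does not work when the map has a key that starts with the prefix, followed by Character.MAX_VALUE, followed by anything else. The comments posted on the accepted answer yield small improvements, but they still do not cover all cases.

The following solution also uses a NavigableMap to pick out the sub map for a given key prefix. The solution is the subMapFrom() method. The trick is not to bump/increment the last character of the prefix, but rather the last character that is not MAX_VALUE, while cutting off all trailing MAX_VALUE characters. For example, if the prefix is "abc", we increment it to "abd". But if the prefix is "ab" + MAX_VALUE, we drop the last character and bump the preceding one instead, which results in "ac".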

import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Stream;

import static java.lang.Character.MAX_VALUE;

public class App
{
    public static void main(String[] args) {
        NavigableMap<String, String> map = new TreeMap<>();
        
        String[] keys = {
                "a",
                "b",
                "b" + MAX_VALUE,
                "b" + MAX_VALUE + "any",
                "c"
        };
        
        // Populate map
        Stream.of(keys).forEach(k -> map.put(k, ""));
        
        // For each key that starts with 'b', find the sub map
        Stream.of(keys).filter(s -> s.startsWith("b")).forEach(p -> {
            System.out.println("Looking for sub map using prefix \"" + p + "\".");
            
            // Always returns expected sub maps with no misses
            // [b, b￿, b￿any], [b￿, b￿any] and [b￿any]
            System.out.println("My solution: " +
                    subMapFrom(map, p).keySet());
            
            // WRONG! Prefix "b" misses "b￿any"
            System.out.println("SO answer:   " +
                    map.subMap(p, true, p + MAX_VALUE, true).keySet());
            
            // WRONG! Prefix "b￿" misses "b￿" and "b￿any"
            System.out.println("SO comment:  " +
                    map.subMap(p, true, tryIncrementLastChar(p), false).keySet());
            
            System.out.println();
        });
    }
    
    private static <V> NavigableMap<String, V> subMapFrom(
            NavigableMap<String, V> map, String keyPrefix)
    {
        final String fromKey = keyPrefix, toKey; // undefined
        
        // Alias
        String p = keyPrefix;
        
        if (p.isEmpty()) {
            // No need for a sub map
            return map;
        }
        
        // ("ab" + MAX_VALUE + MAX_VALUE + ...) returns index 1
        final int i = lastIndexOfNonMaxChar(p);
        
        if (i == -1) {
            // Prefix is all MAX_VALUE through and through, so grab rest of map
            return map.tailMap(p, true);
        }
        
        if (i < p.length() - 1) {
            // Target char for bumping is not last char; cut out the residue
            // ("ab" + MAX_VALUE + MAX_VALUE + ...) becomes "ab"
            p = p.substring(0, i + 1);
        }
        toKey = bumpChar(p, i);
        
        return map.subMap(fromKey, true, toKey, false);
    }
    
    private static int lastIndexOfNonMaxChar(String str) {
        int i = str.length();
        
        // Walk backwards, while we have a valid index
        while (--i >= 0) {
            if (str.charAt(i) < MAX_VALUE) {
                return i;
            }
        }
        
        return -1;
    }
    
    private static String bumpChar(String str, int pos) {
        assert !str.isEmpty();
        assert pos >= 0 && pos < str.length();
        
        final char c = str.charAt(pos);
        assert c < MAX_VALUE;
        
        StringBuilder b = new StringBuilder(str);
        b.setCharAt(pos, (char) (c + 1));
        return b.toString();
    }
    
    private static String tryIncrementLastChar(String p) {
        char l = p.charAt(p.length() - 1);
        return l == MAX_VALUE ?
                // Last character already max, do nothing
                p :
                // Bump last character
                p.substring(0, p.length() - 1) + ++l;
    }
}

Output:

Looking for sub map using prefix "b".
My solution: [b, b￿, b￿any]
SO answer:   [b, b￿]
SO comment:  [b, b￿, b￿any]

Looking for sub map using prefix "b￿".
My solution: [b￿, b￿any]
SO answer:   [b￿, b￿any]
SO comment:  []

Looking for sub map using prefix "b￿any".
My solution: [b￿any]
SO answer:   [b￿any]
SO comment:  [b￿any]
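The bumping rule can also be isolated into a small function that only computes the exclusive upper bound, which makes the "abc" and "ab" + MAX_VALUE examples from the description easy to check. This is a condensed sketch of the same rule, not a replacement for subMapFrom(); the names PrefixUpperBound and upperBound are hypothetical, and returning null for "no upper bound" is a choice made here for illustration.

```java
public class PrefixUpperBound {

    // Returns an exclusive upper bound for all strings starting with prefix,
    // or null when no bound exists (empty prefix, or only MAX_VALUE characters).
    public static String upperBound(String prefix) {
        int i = prefix.length() - 1;
        // Skip trailing MAX_VALUE characters; they cannot be incremented.
        while (i >= 0 && prefix.charAt(i) == Character.MAX_VALUE) {
            i--;
        }
        if (i < 0) {
            return null;
        }
        // Keep everything before position i and bump the character at i.
        return prefix.substring(0, i) + (char) (prefix.charAt(i) + 1);
    }

    public static void main(String[] args) {
        System.out.println(upperBound("abc"));                       // abd
        System.out.println(upperBound("ab" + Character.MAX_VALUE));  // ac
        System.out.println(upperBound("" + Character.MAX_VALUE));    // null
    }
}
```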

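I should perhaps add that I also tried various other approaches, including code I found elsewhere on the internet. All of them failed, either by yielding incorrect results or by crashing with various exceptions.

The key set of a map has no special structure, so I think you have to check each key anyway. So you will not find a way that is faster than a single loop...

I used this code for a speed trial: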

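import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Random;
import java.util.Set;

public class KeyFinder {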

    private static Random random = new Random();

    private interface Receiver {
        void receive(String value);
    }

    public static void main(String[] args) {
        for (int trials = 0; trials < 10; trials++) {
            doTrial();
        }
    }

    private static void doTrial() {

        final Map<String, String> map = new HashMap<String, String>();
        giveRandomElements(new Receiver() {
            public void receive(String value) {
                map.put(value, null);
            }
        }, 10000);

        final Set<String> expressions = new HashSet<String>();
        giveRandomElements(new Receiver() {
            public void receive(String value) {
                expressions.add(value);
            }
        }, 1000);

        int hits = 0;
        long start = System.currentTimeMillis();
        for (String expression : expressions) {
            for (String key : map.keySet()) {
                if (key.startsWith(expression)) {
                    hits++;
                }
            }
        }
        long stop = System.currentTimeMillis();
        System.out.printf("Found %s hits in %s ms\n", hits, stop - start);
    }

    private static void giveRandomElements(Receiver receiver, int count) {
        for (int i = 0; i < count; i++) {
            String value = String.valueOf(random.nextLong());
            receiver.receive(value);
        }

    }
}

The output is:

Found 0 hits in 1649 ms
Found 0 hits in 1626 ms
Found 0 hits in 1389 ms
Found 0 hits in 1396 ms
Found 0 hits in 1417 ms
Found 0 hits in 1388 ms
Found 0 hits in 1377 ms
Found 0 hits in 1395 ms
Found 0 hits in 1399 ms
Found 0 hits in 1357 ms

This counts how many of 10000 random keys start with any of 1000 random String values (10M checks).

So about 1.4 seconds on a simple dual-core laptop; is that too slow for you?
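To relate this trial to the sorted-map answers in this thread, the sketch below counts hits both ways on the same data: once by scanning every key, and once through a TreeMap range view. The class SortedKeyFinder and its methods are hypothetical, and the expressions are short random numbers (an assumption made here so that some hits actually occur). No timings are claimed; the point is only that both counts agree.

```java
import java.util.NavigableMap;
import java.util.Random;
import java.util.TreeMap;

public class SortedKeyFinder {

    // Counts matching keys by checking every key, as in the trial above.
    public static int countByScan(Iterable<String> keys, String prefix) {
        int hits = 0;
        for (String key : keys) {
            if (key.startsWith(prefix)) {
                hits++;
            }
        }
        return hits;
    }

    // Counts matching keys through a range view of the sorted map.
    // Assumption: keys contain only digits and '-', so the simple
    // prefix + MAX_VALUE upper bound cannot miss any key.
    public static int countBySubMap(NavigableMap<String, ?> map, String prefix) {
        return map.subMap(prefix, prefix + Character.MAX_VALUE).size();
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed, chosen arbitrarily
        NavigableMap<String, String> map = new TreeMap<>();
        for (int i = 0; i < 10000; i++) {
            map.put(String.valueOf(random.nextLong()), "");
        }

        int scanHits = 0;
        int rangeHits = 0;
        for (int i = 0; i < 1000; i++) {
            String expression = String.valueOf(random.nextInt(1000));
            scanHits += countByScan(map.keySet(), expression);
            rangeHits += countBySubMap(map, expression);
        }
        System.out.println("scan: " + scanHits + ", sub map: " + rangeHits);
    }
}
```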
