
Efficiently remove strings that are contained within other strings in LinkedList

I have a simple LinkedList that contains strings.

LinkedList<String> list = new LinkedList<String>();
list.add("A, B, C, D");
list.add("R");
list.add("A");
list.add("C, D");

So, our LinkedList is: [ "A, B, C, D", "R", "A", "C, D" ]

As you can see, "A" and "C, D" are already contained in "A, B, C, D".

What is the most efficient way to remove the contained strings?

First, you can call contains() before adding a new value (that works as long as you add a single element each time, which you do not...).

Second, it seems this "problem" could easily be avoided by changing the way you add the strings, or by dropping the LinkedList restriction.

Anyway, here is a simple method that might suit your needs:

// requires: import java.util.LinkedList; import java.util.ListIterator;
private void deleteIfContains(LinkedList<String> list, String str) {
    ListIterator<String> iterator = list.listIterator();

    while (iterator.hasNext()) {
        String current = iterator.next();

        if (current.contains(str)) {
            // replace() won't clean up the leftover ','; you will need a regex for that.
            // set() swaps the element in place through the iterator, which avoids
            // ConcurrentModificationException and any index bookkeeping.
            iterator.set(current.replace(str, ""));
        }
    }
}

I would suggest using a Set instead, but then every letter would have to be stored as its own String element (maybe you should use Character?).

If you really want to stick to your own idea, consider implementing your own Set. But first figure out what should happen in this situation:

LinkedList<String> list = new LinkedList<String>();
list.add("A, B, C, D");
list.add("C, E");

C should be rejected, but what about E?
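
To make the ambiguity concrete, here is a small sketch that compares a plain substring check with a per-element comparison. It assumes the elements are separated by commas; the class `OverlapDemo` and its helpers `tokens` and `newElements` are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class OverlapDemo {
    // Hypothetical helper: split "A, B, C" into the ordered set {A, B, C}.
    public static Set<String> tokens(String csv) {
        return new LinkedHashSet<>(Arrays.asList(csv.trim().split("\\s*,\\s*")));
    }

    // Elements of candidate that are not already present in existing.
    public static Set<String> newElements(String existing, String candidate) {
        Set<String> result = tokens(candidate);
        result.removeAll(tokens(existing));
        return result;
    }

    public static void main(String[] args) {
        String existing = "A, B, C, D";
        String candidate = "C, E";

        // As a substring, "C, E" does not occur in "A, B, C, D".
        System.out.println(existing.contains(candidate));     // false

        // As elements, only E is new.
        System.out.println(newElements(existing, candidate)); // [E]
    }
}
```

So under the substring reading "C, E" would be kept whole, while under the element reading only "E" is new. Which of the two you want has to be decided before picking a data structure.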

As @nikowis says, the best solution depends on the problem definition.

If the values are the elements "A", "B", "C", "D", ..., the most efficient solution (in computation time) may be to transform the list into a List<Set<String>> or a single Set<String>.
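
A minimal sketch of the element-based reading, assuming comma-separated elements and that an entry should be dropped when all of its elements appear in some other entry. The class `SubsetFilter` and its methods are hypothetical names, and the pairwise comparison is quadratic in the number of entries, so this is an illustration rather than a tuned solution.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

public class SubsetFilter {
    // Hypothetical helper: split "A, B, C" into the set {A, B, C}.
    public static Set<String> tokens(String csv) {
        return new HashSet<>(Arrays.asList(csv.trim().split("\\s*,\\s*")));
    }

    // Removes, in place, every entry whose elements all appear in another entry.
    // When two entries hold exactly the same elements, the first one is kept.
    public static void removeContained(LinkedList<String> list) {
        List<Set<String>> sets = new ArrayList<>();
        for (String entry : list) {
            sets.add(tokens(entry));
        }

        boolean[] drop = new boolean[sets.size()];
        for (int i = 0; i < sets.size(); i++) {
            for (int j = 0; j < sets.size() && !drop[i]; j++) {
                if (i == j || !sets.get(j).containsAll(sets.get(i))) {
                    continue;
                }
                boolean sameElements = sets.get(i).size() == sets.get(j).size();
                // Proper subset: always drop. Same elements: drop only the later entry.
                if (!sameElements || j < i) {
                    drop[i] = true;
                }
            }
        }

        // Remove through the iterator to avoid ConcurrentModificationException.
        Iterator<String> it = list.iterator();
        for (int i = 0; it.hasNext(); i++) {
            it.next();
            if (drop[i]) {
                it.remove();
            }
        }
    }
}
```

With the list from the question, this leaves the two entries "A, B, C, D" and "R". An entry such as "C, E" would be kept whole, because E does not appear anywhere else.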

If the values are substrings, i.e. "C, E" is ONE value (and not the two values "C" and "E"), you can use a substring "Trie" ( https://en.wikipedia.org/wiki/Trie ), that is, a trie holding every suffix of the stored strings. It can check very quickly whether a string occurs inside one of the stored strings: O(N), with N the length of the string to add.
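
A rough sketch of the substring reading, using a naive suffix trie rather than any library structure. The class `SubstringFilter` and its methods are hypothetical names. It assumes that processing the strings longest first is acceptable, and it trades memory for speed: inserting all suffixes of a string of length L takes O(L²) time and space, while each lookup is linear in the length of the queried string.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SubstringFilter {
    // One trie node; children are keyed by the next character.
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
    }

    private final Node root = new Node();

    // Insert every suffix of s, so each substring of s becomes a path from the root.
    void addAllSuffixes(String s) {
        for (int start = 0; start < s.length(); start++) {
            Node node = root;
            for (int i = start; i < s.length(); i++) {
                node = node.children.computeIfAbsent(s.charAt(i), c -> new Node());
            }
        }
    }

    // True if s occurs inside any string added so far.
    // Note: the empty string is always reported as contained.
    boolean containsSubstring(String s) {
        Node node = root;
        for (int i = 0; i < s.length(); i++) {
            node = node.children.get(s.charAt(i));
            if (node == null) {
                return false;
            }
        }
        return true;
    }

    // Returns the entries that are not contained in another entry, in original order.
    public static LinkedList<String> removeContained(List<String> input) {
        // Longest first, so possible containers are indexed before shorter candidates.
        List<String> byLength = new ArrayList<>(input);
        byLength.sort((a, b) -> Integer.compare(b.length(), a.length()));

        SubstringFilter trie = new SubstringFilter();
        Set<String> kept = new HashSet<>();
        for (String s : byLength) {
            if (!trie.containsSubstring(s)) {
                trie.addAllSuffixes(s);
                kept.add(s);
            }
        }

        // Rebuild in the original order; exact duplicates survive only once.
        LinkedList<String> result = new LinkedList<>();
        for (String s : input) {
            if (kept.remove(s)) {
                result.add(s);
            }
        }
        return result;
    }
}
```

For the list from the question, `SubstringFilter.removeContained(list)` returns the two entries "A, B, C, D" and "R". For short lists, a plain nested loop over `String.contains()` is likely simpler and fast enough; the trie only pays off when many strings have to be checked.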

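Convert the CSV-format strings to individual string values, then store them as set elements. If add() returns false, the value is already present.

// requires: import java.util.Arrays; import java.util.HashSet; import java.util.Set;
// Split on commas and drop the surrounding whitespace, so " C" and "C" compare equal.
String[] values = csvStr1.trim().split("\\s*,\\s*");
Set<String> hashSet = new HashSet<String>(Arrays.asList(values));

String[] values2 = csvStr2.trim().split("\\s*,\\s*");
for (String value : values2) {
    // add() returns false when the set already contains the value.
    if (!hashSet.add(value)) {
        // value already present. Ignore this or do whatever you want.
    }
}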

