简体   繁体   中英

Generate all possible strings optimization

I have to generate a list of every string possible (given a specific string to pick characters from), the first method I came up with is this one:

public String getRandomLetters(Integer size, String range){
    String word = "";
    Random r = new Random();

    for (int i = 0; i < size; i++) {
        word += range.charAt(r.nextInt(range.length()));
    } 

    return word;
}

Using like this: getRandomLetters(2, "abc");

This will return a string of 2 letters.

Now, with this given range I have to generate a list of random strings. This list of words must not have repeatable words, must start by 1 character and when out of possibilities just add another.

Im trying this:

public List<String> getListOfRandomWords(Integer listLenght){
    String range = "abc"; //abcdefghijklmnopqrstuvwxyz
    Integer rangeLenght = range.length();

    List<String> words = new ArrayList();
    int counter = 1;

    int possibilities = (int) Math.pow(Double.valueOf(rangeLenght), Double.valueOf(counter));

    String word = "";

    for (int i = 0; i < listLenght; i++) {
        word = getRandomLetters(counter, range);

        while (words.contains(word)) {
            if(words.size() == possibilities){
                counter++;
                possibilities += (int) Math.pow(Double.valueOf(rangeLenght), Double.valueOf(counter));
            }
            word = getRandomLetters(counter, range);
        }

        words.add(word);
    }

    return words;
}

To run this:

        List<String> testList = getListOfRandomWords(20);

        for (String block : testList) {
            System.out.println(block);
        }

Output:

c
b
a
cc
ca
aa
bb
ba
ac
bc
ab
cb
bcc
cca
bac
aab
bab
cbb
baa
bcb

The output is ok and method works properly, my only concern is while(randomLetters.contains(randomLetter)) , is this the most optimal way to check for duplicate values?

PD: I can't use a sum to make this easier and generate the strings in order, they must come in random order.

Using List<String> words = new ArrayList<>(); and words.contains(word) is definitely not the most optimal way to check for duplicate values. This is an O(N²) operation, since you have an O(N) search for each word, and you're looping over that one or more times for every word you add.

Set<String> words = new HashSet<>(); and words.contains(word) should be significantly faster.

Of course, if you want to maintain the order you've added the words in, instead of some random order, you'd want to use: Set<String> words = new LinkedHashSet<>(); .

Use return new ArrayList<>(words) to convert the returned value back to a List<String> .

The best way I have found to cycle through all permutations is Heap's algorithm .

I don't understand it completely, as I have not read the proof behind it, but it definitely works, and cycles through all permutations exactly once. Using a random permutation generator is never guaranteed to generate all permutations.

Random and Unique can't be satisfied together. So assuming you need random order of all permutations.

If length of the string is relatively small (so list of results fit in memory) you can generate all permutations first and than shuffle result. This would give pretty decent performance as both operations are linear on size of output..

To get code combine Smart way to generate permutation and combination of String and Random shuffling of an array

If you need just one random unique result at a time - your original code to generate random string is ok, but you should use Hashset or similar structure to find duplicates ( Fastest data structure for contains() in Java? ).

Note that generating all combination this way is not practical - after about 80% of combinations number of retries will grow prohibitively larger and will be close to number of all combinations by the end.

Alternatively you can convert random numbers to string directly:

  1. generate random number from 1 to (numberOfCharacters^LengthOfString),
  2. check if it has duplicate in Hashset - go back to 1
  3. add number to Hashset
  4. convert number to string (divide by numberOfCharacters till you get to 0 and use reminder on each step to select character)
    19 for "abc":
    19/3= 6, reminder 1 - add "b" to result 6/3 = 2, reminder 0 - add "a" to result 2 - add "c" to result = "bac"

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM