String manipulation algorithm to find string greater than original string

I have a few words (strings) such as 'hefg', 'dhck', 'dkhc' and 'lmno'. Each one is to be converted to a new word by swapping some or all of its characters, such that the new word is lexicographically greater than the original and is also the smallest of all the words greater than the original. For example, 'dhck' should output 'dhkc' and not 'kdhc', 'dchk' or any other.

I have these inputs:

hefg
dhck
dkhc
fedcbabcd

which should output

hegf
dhkc
hcdk
fedcbabdc

I tried the Python code below. It worked for all inputs except 'dkhc' and 'fedcbabcd'. I have figured out that in the case of 'fedcbabcd' the first character is the maximum, so nothing gets swapped, and I get "ValueError: min() arg is an empty sequence".

How can I modify the algorithm to fix these cases?

list1=['d','k','h','c']
list2=[]
maxVal=list1.index(max(list1))
for i in range(maxVal):
    temp=list1[maxVal]
    list1[maxVal]=list1[i-1]
    list1[i-1]=temp
    list2.append(''.join(list1))
print(min(list2))

You can try something like this:

  • iterate the characters in the string in reverse order
  • keep track of the characters you've already seen, and where you saw them
  • if you've seen a character larger than the current character, swap the current character with the smallest of those larger characters
  • sort all the characters after that position to get the minimum string

Example code:

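def next_word(word):
    word = list(word)
    seen = {}  # character -> rightmost index at which it occurs
    for i in range(len(word)-1, -1, -1):
        if any(x > word[i] for x in seen):
            # swap with the smallest character to the right that is larger
            x = min(x for x in seen if x > word[i])
            word[i], word[seen[x]] = word[seen[x]], word[i]
            # sort the tail so that the result is as small as possible
            return ''.join(word[:i+1] + sorted(word[i+1:]))
        if word[i] not in seen:
            seen[word[i]] = i
    return None  # no greater arrangement of the letters exists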

for word in ["hefg", "dhck", "dkhc", "fedcbabcd"]:
    print(word, next_word(word))

Result:

hefg hegf
dhck dhkc
dkhc hcdk
fedcbabcd fedcbabdc

The max character and its position don't influence the algorithm in the general case. For example, for 'fedcbabcd', you could prepend an 'a' or a 'z' to the string and it wouldn't change the fact that you need to swap the final two letters.

Consider the input 'dgfecba'. Here, the output is 'eabcdfg'. Why? Notice that the final six letters are sorted in decreasing order, so rearranging only those letters gives a lexicographically smaller string, which is no good. It follows that you need to replace the initial 'd'. What should we put in its place? We want something greater than 'd', but as small as possible, so 'e'. What about the remaining six letters? Again, we want a string that's as small as possible, so we sort the letters lexicographically: 'eabcdfg'.
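For short inputs, this reasoning can be checked by brute force. The sketch below is only a verification aid, not the algorithm itself: the helper name next_word_bruteforce is made up for illustration, and it enumerates every permutation, so it is practical only for words of a handful of letters.

```python
from itertools import permutations

def next_word_bruteforce(word):
    """Smallest rearrangement of `word` that is strictly greater than it.

    Hypothetical checking helper: it enumerates all len(word)!
    permutations, so use it on short inputs only.
    """
    candidates = {''.join(p) for p in permutations(word)}
    greater = [c for c in candidates if c > word]
    # If the letters are already in non-increasing order,
    # no greater rearrangement exists.
    return min(greater) if greater else None

print(next_word_bruteforce('dgfecba'))  # eabcdfg
```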

So the algorithm is:

  • start at the back of the string (right end);
  • go left while the symbols keep increasing;
  • let i be the rightmost position where s[i] < s[i + 1]; in our case, that's i = 0;
  • leave the symbols at positions 0, 1, ..., i - 1 untouched;
  • find the position among i+1 ... n-1 containing the least symbol that's greater than s[i] (if that symbol occurs more than once, take its rightmost occurrence); call this position j; in our case, j = 3;
  • swap s[i] and s[j]; in our case, we obtain 'egfdcba';
  • reverse the string s[i+1] ... s[n-1]; in our case, we obtain 'eabcdfg'.
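The steps above can be sketched in Python roughly as follows. The function name next_permutation and the choice to return None when no greater arrangement exists are assumptions made for this example, not something the steps prescribe.

```python
def next_permutation(s):
    """Return the smallest rearrangement of s greater than s, or None."""
    s = list(s)
    n = len(s)

    # Go left from the end while the symbols keep increasing
    # (that is, while s[i] >= s[i + 1]).
    i = n - 2
    while i >= 0 and s[i] >= s[i + 1]:
        i -= 1
    if i < 0:
        return None  # the whole string is non-increasing

    # The suffix s[i+1:] is non-increasing, so the rightmost symbol
    # greater than s[i] is also the least such symbol.
    j = n - 1
    while s[j] <= s[i]:
        j -= 1

    s[i], s[j] = s[j], s[i]

    # After the swap the suffix is still non-increasing,
    # so reversing it sorts it in ascending order.
    s[i + 1:] = s[i + 1:][::-1]
    return ''.join(s)

print(next_permutation('dgfecba'))  # eabcdfg
```

Scanning for j from the right end matters when letters repeat: it picks the rightmost copy of the least greater symbol, which keeps the suffix non-increasing after the swap.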

Your problem can be reworded as finding the next lexicographical permutation of a string.

The algorithm described at the above link is as follows:

1) Find the longest non-increasing suffix

2) The element immediately to the left of the suffix is our pivot

3) Find the right-most successor of the pivot in the suffix

4) Swap the successor and the pivot

5) Reverse the suffix

The above algorithm is especially interesting because it runs in O(n) time.

Code

def next_lexicographical(word):
    word = list(word)

    # Find the pivot and the successor
    pivot = next(i for i in range(len(word) - 2, -1, -1) if word[i] < word[i+1])
    successor = next(i for i in range(len(word) - 1, pivot, -1) if word[i] > word[pivot])

    # Swap the pivot and the successor
    word[pivot], word[successor] = word[successor], word[pivot]

    # Reverse the suffix
    word[pivot+1:] = word[-1:pivot:-1]

    # Reform the word and return it
    return ''.join(word)

The above code will raise a StopIteration exception if the word is already the last lexicographical permutation, because next() finds no pivot.
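If an exception is not wanted in that case, one option is to give next() a default value. The variant below is a sketch under that assumption; the name next_lexicographical_or_none and the choice of None as the "no next permutation" result are illustrative only.

```python
def next_lexicographical_or_none(word):
    chars = list(word)

    # Find the pivot; next() returns the default None instead of
    # raising StopIteration when the word is non-increasing throughout.
    pivot = next(
        (i for i in range(len(chars) - 2, -1, -1) if chars[i] < chars[i + 1]),
        None,
    )
    if pivot is None:
        return None

    # A successor always exists here, because chars[pivot + 1] > chars[pivot].
    successor = next(
        i for i in range(len(chars) - 1, pivot, -1) if chars[i] > chars[pivot]
    )

    # Swap the pivot and the successor, then reverse the suffix.
    chars[pivot], chars[successor] = chars[successor], chars[pivot]
    chars[pivot + 1:] = chars[-1:pivot:-1]
    return ''.join(chars)

print(next_lexicographical_or_none('dkhc'))  # hcdk
print(next_lexicographical_or_none('kjih'))  # None
```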

Example

words = [
    'hefg',
    'dhck',
    'dkhc',
    'fedcbabcd'
]

for word in words:
    print(next_lexicographical(word))

Output

hegf
dhkc
hcdk
fedcbabdc
