How to replace the alphabetically smallest letter by 1, the next smallest by 2, but not discard multiple occurrences of a letter?

Question

I am using Python 3 and I want to write a function that takes a string of capital letters, say s = 'VENEER', and returns the output '614235'.

The function I have so far is:

def key2(s):
    new=''
    for ch in s:
        acc=0
        for temp in s:
            if temp<=ch:
                acc+=1
        new+=str(acc)
    return(new)

If s == 'VENEER' then new == '634335'. If s contains no duplicates, the code works perfectly.

I am stuck on how to edit the code to get the output stated in the beginning.
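
For reference, the function above counts every letter less than or equal to ch, so all copies of a repeated letter receive the same count. One possible minimal change, sketched here under the assumption that ties should be numbered left to right, is to count strictly smaller letters and add only the equal letters that occur earlier in the string. The name key2_fixed is hypothetical and not from the original post.

```python
def key2_fixed(s):
    """Rank each letter alphabetically; ties are numbered left to right."""
    result = []
    for i, ch in enumerate(s):
        # Letters anywhere in s that sort strictly before ch.
        smaller = sum(1 for other in s if other < ch)
        # Copies of the same letter that appear before position i.
        earlier_equal = sum(1 for other in s[:i] if other == ch)
        result.append(str(smaller + earlier_equal + 1))
    return ''.join(result)


print(key2_fixed("VENEER"))  # 614235
```

Like the original, this is quadratic in the length of s, which should be fine for short keys.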

Answer 1

Note that the built-in method for replacing characters within a string, str.replace, takes an optional third argument: count. You can use this to your advantage by replacing only the first appearance of each letter. Once you replace the first 'E', the second one becomes the first remaining appearance, and so on:

def process(s):
    for i, c in enumerate(sorted(s), 1):
        # print(s)  # uncomment to see the process
        s = s.replace(c, str(i), 1)
    return s

I have used the built-in functions sorted and enumerate to get the appropriate numbers to replace the characters:

1 2 3 4 5 6 # 'enumerate' from 1 -> 'i'
E E E N R V # 'sorted' input 's' -> 'c'

Example usage:

>>> process("VENEER")
'614235'
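
To make the replacement order visible, here is a small sketch that records the string after every step instead of printing it. The helper name process_steps is made up for illustration. It assumes the input contains only letters, so the inserted digits are never matched by a later replacement.

```python
def process_steps(s):
    """Return the intermediate strings produced by the replace-first approach."""
    steps = [s]
    for i, c in enumerate(sorted(s), 1):
        # Replace only the first remaining occurrence of c with its rank.
        s = s.replace(c, str(i), 1)
        steps.append(s)
    return steps


for step in process_steps("VENEER"):
    print(step)
# VENEER
# V1NEER
# V1N2ER
# V1N23R
# V1423R
# V14235
# 614235
```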

Answer 2

One way would be to use numpy.argsort to find the order, then find the ranks, and join them:

>>> import numpy as np
>>> s = 'VENEER'
>>> order = np.argsort(list(s), kind='stable')  # stable sort keeps repeated letters in input order
>>> rank = np.argsort(order) + 1
>>> ''.join(map(str, rank))
'614235'
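
If numpy is not available, the same "argsort of argsort" idea can be sketched in plain Python. The function name rank_string is hypothetical. It relies on the fact that Python's sorted is stable, so repeated letters keep their left-to-right order.

```python
def rank_string(s):
    """Pure-Python version of the order-then-rank idea."""
    # order[k] is the index in s of the k-th smallest letter (stable for ties).
    order = sorted(range(len(s)), key=lambda i: s[i])
    # Invert the permutation: rank[index] is the 1-based position in sorted order.
    rank = [0] * len(s)
    for position, index in enumerate(order, 1):
        rank[index] = position
    return ''.join(map(str, rank))


print(rank_string("VENEER"))  # 614235
```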

Answer 3

You can use a regex:

import re

s="VENEER"
for n, c in enumerate(sorted(s), 1):
    s=re.sub('%c' % c, '%i' % n, s, count=1)

print(s)
# 614235

You can also use several nested generators:

def indexes(seq):
    for v, i in sorted((v, i) for (i, v) in enumerate(seq)):
        yield i

print(''.join('%i' % (e+1) for e in indexes(indexes(s))))
# 614235

Answer 4

Judging from your title, you may want something like this instead, where repeated letters share the same number:

>>> from collections import OrderedDict
>>> s='VENEER'
>>> d = {k: n for n, k in enumerate(OrderedDict.fromkeys(sorted(s)), 1)}
>>> "".join(map(lambda k: str(d[k]), s))
'412113'

As @jonrsharpe commented, I didn't actually need to use OrderedDict here.
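
A minimal sketch of the same idea without OrderedDict, assuming the goal is for every copy of a letter to get the same number: sorting the set of distinct letters is enough to build the lookup table. The function name dense_ranks is hypothetical.

```python
def dense_ranks(s):
    """Map each distinct letter to its alphabetical rank; duplicates share a rank."""
    # sorted(set(s)) yields the distinct letters in alphabetical order.
    d = {letter: n for n, letter in enumerate(sorted(set(s)), 1)}
    return ''.join(str(d[letter]) for letter in s)


print(dense_ranks("VENEER"))  # 412113
```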

Answer 5

def caps_to_nums(in_string):
    indexed_replaced_string = [(idx, val) for val, (idx, ch) in enumerate(sorted(enumerate(in_string), key=lambda x: x[1]), 1)]
    return ''.join(map(lambda x: str(x[1]), sorted(indexed_replaced_string)))

First we run enumerate to record each letter's original position:

list(enumerate("VENEER"))
# [(0, 'V'), (1, 'E'), (2, 'N'), (3, 'E'), (4, 'E'), (5, 'R')]
# this gives us somewhere to RETURN to later.

Then we sort that according to its second element, which is alphabetical, and run enumerate again with a start value of 1 to get the replacement value. We throw away the alpha value, since it's not needed anymore.

[(idx, val) for val, (idx, ch) in enumerate(sorted([(0, 'V'), (1, 'E'), ...], key = lambda x: x[1]), start=1)]
# [(1, 1), (3, 2), (4, 3), (2, 4), (5, 5), (0, 6)]

Then sort by the first element (the original index) and map each pair to its second element (our value) as a string:

map(lambda x: str(x[1]), sorted(replacement_values))

and str.join it

''.join(that_mapping)

Ta-da!
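
The walkthrough above can also be written out as separate, runnable steps. This is only a sketch of the same approach. The names caps_to_nums_steps and by_letter are made up for illustration, and replacement_values is the intermediate list mentioned in the explanation.

```python
def caps_to_nums_steps(in_string):
    # Step 1: remember each letter's original index.
    indexed = list(enumerate(in_string))
    # Step 2: sort by letter; sorted() is stable, so ties keep their original order.
    by_letter = sorted(indexed, key=lambda pair: pair[1])
    # Step 3: number the sorted letters from 1 and drop the letter itself.
    replacement_values = [(idx, val)
                          for val, (idx, ch) in enumerate(by_letter, start=1)]
    # Step 4: restore the original order and join the numbers.
    return ''.join(str(val) for idx, val in sorted(replacement_values))


print(caps_to_nums_steps("VENEER"))  # 614235
```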

5 answers

solution1
4 2014-07-19 21:55:44

solution2
1 2014-07-19 21:42:35

solution3
1 2014-07-19 22:28:33

solution4
0 2014-07-19 22:12:32

solution5
-1 2014-07-19 21:51:26
