
Python - how to multiply characters in string by number after character

As the title says: for example, I want to turn 'A3G3A' into 'AAAGGGA'. I have this so far:

if any(i.isdigit() for i in string):
    for i in range(0, len(string)):
        if string[i].isdigit():
            # (I am lost after this)

Here's a simplistic approach:

string = 'A3G3A'

expanded = ''

for character in string:
    if character.isdigit():
        expanded += expanded[-1] * (int(character) - 1)
    else:
        expanded += character

print(expanded)

OUTPUT: AAAGGGA

It assumes valid input. Its limitation is that the repetition factor has to be a single digit, e.g. 2-9. If we want repetition factors greater than 9, we have to do slightly more parsing of the string:

from itertools import groupby

groups = groupby('DA10G3ABC', str.isdigit)

expanded = []

for is_numeric, characters in groups:

    if is_numeric:
        expanded.append(expanded[-1] * (int(''.join(characters)) - 1))
    else:
        expanded.extend(characters)

print(''.join(expanded))

OUTPUT: DAAAAAAAAAAGGGABC
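
To make the difference between the two versions concrete, here is a small sketch that wraps each one in a function and runs both on an input with a two-digit count. The function names expand_single_digit and expand_multi_digit are made up for illustration; the logic is the same as in the two snippets, and valid input (no leading digit) is still assumed.

```python
from itertools import groupby


def expand_single_digit(string):
    # Same loop as the first snippet: every digit is handled on its own.
    expanded = ''
    for character in string:
        if character.isdigit():
            expanded += expanded[-1] * (int(character) - 1)
        else:
            expanded += character
    return expanded


def expand_multi_digit(string):
    # Same groupby logic as the second snippet: a run of digits is one number.
    expanded = []
    for is_numeric, characters in groupby(string, str.isdigit):
        if is_numeric:
            expanded.append(expanded[-1] * (int(''.join(characters)) - 1))
        else:
            expanded.extend(characters)
    return ''.join(expanded)


print(expand_single_digit('A10G3'))  # AGGG ('1' and '0' are read separately)
print(expand_multi_digit('A10G3'))   # ten A's followed by GGG
```

The single-digit loop reads '10' as a '1' and then a '0', neither of which adds any copies, which is why the count is effectively lost.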

Assuming that the format is always a letter followed by an integer, with the last integer possibly missing:

>>> from itertools import zip_longest  # izip_longest in Python 2
>>> s = 'A3G3A'
>>> ''.join(c*int(i) for c, i in zip_longest(*[iter(s)]*2, fillvalue=1))
'AAAGGGA'
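
The `*[iter(s)]*2` idiom above is compact but not obvious: the same iterator object is passed twice, so each tuple consumes two consecutive characters. Here is a small sketch of the intermediate pairs, written for Python 3, where the itertools function is called zip_longest (izip_longest in Python 2). The variable names pairs and result are only for illustration.

```python
from itertools import zip_longest

s = 'A3G3A'

# The list holds the same iterator twice, so each tuple takes two
# consecutive characters; fillvalue=1 supplies the missing last count.
pairs = list(zip_longest(*[iter(s)] * 2, fillvalue=1))
print(pairs)  # [('A', '3'), ('G', '3'), ('A', 1)]

result = ''.join(c * int(i) for c, i in pairs)
print(result)  # AAAGGGA
```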

Assuming that the format can be any substring followed by an integer, with the integer possibly longer than one digit and the last integer possibly missing:

>>> from itertools import zip_longest  # izip_longest in Python 2
>>> import re
>>> s = 'AB10GY3ABC'
>>> sp = re.split(r'(\d+)', s)
>>> ''.join(c*int(i) for c, i in zip_longest(*[iter(sp)]*2, fillvalue=1))
'ABABABABABABABABABABGYGYGYABC'

A minimal pure-Python version that handles all of these cases:

text = "WA5OUH2!10"
output = ''
n = ''
c = ''
for x in text + 'a':
    if x.isdigit():
        n += x
    else:
        if n == '':
            n = '1'
        output = output + c*int(n)
        n = ''
        c = x

With text = "WA5OUH2!10", output is WAAAAAOUHH!!!!!!!!!!. The trailing + 'a' is a sentinel: each character is only written out once the next non-digit arrives, so an extra non-digit is needed to flush the last one (the sentinel itself is never written).

Another approach could be -

import re
input_string = 'A3G3A'
alphabets = re.findall('[A-Z]', input_string) # List of all letters - ['A', 'G', 'A']
digits = re.findall('[0-9]+', input_string) # List of all numbers - ['3', '3']
final_output = "".join([alphabets[i]*int(digits[i]) for i in range(0, len(alphabets)-1)]) + alphabets[-1]
#  This expression repeats each letter by the number next to it (except for the last letter), joins the list of strings into a single string, and appends the last letter
#  final_output - 'AAAGGGA'

Explanation -

In [31]: alphabets # List of letters in the string
Out[31]: ['A', 'G', 'A']

In [32]: digits  # List of numbers in the string (including numbers with more than one digit)
Out[32]: ['3', '3']

In [33]: list_of_strings = [alphabets[i]*int(digits[i]) for i in range(0, len(alphabets)-1)]  # List of strings after repetition

In [34]: list_of_strings
Out[34]: ['AAA', 'GGG']

In [35]: joined_string = "".join(list_of_strings) # Joined list of strings

In [36]: joined_string
Out[36]: 'AAAGGG'

In [38]: final_output = joined_string + input_string[-1] # Append last character of the string

In [39]: final_output
Out[39]: 'AAAGGGA'
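
The version above relies on every letter except the last being followed by a number. As a hedged variation (not part of the answer itself), a single re.findall call with two groups can pair each non-digit character with its optional count, so a missing count anywhere defaults to 1. The function name expand_pairs is made up for this sketch, and digits at the very start of the string are simply ignored.

```python
import re


def expand_pairs(input_string):
    # Each match is a tuple: (one non-digit character, the digits after it or '').
    pairs = re.findall(r'(\D)(\d*)', input_string)
    # An empty count means the character appears once.
    return ''.join(char * int(count or 1) for char, count in pairs)


print(expand_pairs('A3G3A'))      # AAAGGGA
print(expand_pairs('DA10G3ABC'))  # 'D' + 'A' * 10 + 'GGG' + 'ABC'
```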

Using the * operator to repeat the characters, assuming each repeat count is a single digit in the range [1, 9]:

q = 'A3G3A'
try:
    int(q[-1])  # check if it ends with a digit
except ValueError:
    q = q + '1'  # repeat the last character only once
"".join([q[i]*int(q[i+1]) for i in range(0, len(q), 2)])

string = 'A3G3A'
string.rjust(10, 'A')  # left-pads to width 10: 'AAAAAA3G3A'

Hope it helps. If you need more, please consider searching for Python string padding.

A one-line solution, assuming each count is a single digit in the range [1, 9].

>>> s = 'A3G3A'
>>> s = ''.join(s[i] if not s[i].isdigit() else s[i-1]*(int(s[i])-1) for i in range(0, len(s)))
>>> print(s)
AAAGGGA

Embrace regex! This finds all occurrences of the pattern non-digit character followed by non-negative integer (any number of digits) and replaces that substring with that many of the character.

import re
re.sub(r'([^\d])(\d+)', lambda m: m.group(1) * int(m.group(2)), 'A3G3A')
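
For reference, here is a short sketch of what the substitution returns on a few inputs. The wrapper name expand_regex is hypothetical; the pattern and lambda are the ones from the snippet above. Characters with no number after them are not matched and stay as they are, and a count of 0 removes the character.

```python
import re


def expand_regex(s):
    # group(1) is the non-digit character, group(2) the run of digits after it.
    return re.sub(r'([^\d])(\d+)', lambda m: m.group(1) * int(m.group(2)), s)


print(expand_regex('A3G3A'))  # AAAGGGA
print(expand_regex('AB12C'))  # A, then twelve B's, then C
print(expand_regex('A0B'))    # B
```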

This can be solved with NumPy:

import numpy as np

x = 'A3G3A'

if not x[-1].isdigit():
    x += '1'

letters = list(x[::2])
times = list(map(int,x[1::2]))
lst = ''.join(np.repeat(letters, times))

#output
'AAAGGGA'
