Replacing Certain Parts of a String in Python

Question

I cannot seem to solve this. I have many different strings, and I need to replace the ends of them, but the ends are always different lengths. Here is an example of a couple of strings:

string1 = "thisisnumber1(111)"
string2 = "itsraining(22252)"
string3 = "fluffydog(3)"

Now when I print these out it will of course print the following:

thisisnumber1(111)
itsraining(22252)
fluffydog(3)

What I would like it to print, though, is the following:

thisisnumber1
itsraining
fluffydog

I would like to remove the part in the parentheses from each string, but I do not know how, since the lengths are always changing. Thank you.

Answer 1

You can use str.rsplit for this:

>>> string1 = "thisisnumber1(111)"
>>> string2 = "itsraining(22252)"
>>> string3 = "fluffydog(3)"
>>>
>>> string1.rsplit("(")
['thisisnumber1', '111)']
>>> string1.rsplit("(")[0]
'thisisnumber1'
>>>
>>> string2.rsplit("(")
['itsraining', '22252)']
>>> string2.rsplit("(")[0]
'itsraining'
>>>
>>> string3.rsplit("(")
['fluffydog', '3)']
>>> string3.rsplit("(")[0]
'fluffydog'
>>>

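str.rsplit splits the string from right to left rather than left to right like str.split. So, we split the string on ( and then retrieve the element at index 0 (the first element). This will be everything before the (...) at the end of each string. Note that the direction only matters when a maxsplit argument is given: without one, rsplit("(") returns the same list as split("("), so index 0 is everything before the first (. If a string could contain an earlier (, use rsplit("(", 1) to cut only at the last one.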
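
A minimal sketch of how maxsplit changes the result, assuming a string might contain more than one set of parentheses. The sample "my(dog)name(42)" and the helper name strip_suffix are made up for illustration and are not part of the question:

```python
def strip_suffix(s):
    """Return everything before the last "(" in s (hypothetical helper)."""
    # maxsplit=1 makes rsplit cut only at the rightmost "(".
    return s.rsplit("(", 1)[0]

tricky = "my(dog)name(42)"  # made-up sample with two sets of parentheses

print(tricky.rsplit("(")[0])         # my
print(strip_suffix(tricky))          # my(dog)name
print(strip_suffix("fluffydog(3)"))  # fluffydog
```

For the three strings in the question both forms give the same answer, since each contains only one (.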

Answer 2

Your other option is to use regular expressions, which can give you more precise control over what you want to get.

import re
regex = r"(.+)\(\d+\)"

# string1, string2 and string3 are the strings from the question
print(re.match(regex, string1).groups()[0])  # prints thisisnumber1
print(re.match(regex, string2).groups()[0])  # prints itsraining
print(re.match(regex, string3).groups()[0])  # prints fluffydog

Breakdown of what's happening:

regex = r"(.+)\(\d+\)" is the regular expression, the formula for the string you're trying to find

.+ means match 1 or more characters of any kind except newline

\d+ means match 1 or more digits

\( and \) are the literal "(" and ")" characters

Putting .+ in parentheses puts that sequence in a group, meaning that group of characters is one that you want to be able to access later on. We don't put the sequence \(\d+\) in a group because we don't care about those characters.

re.match(regex, string1).groups() returns a tuple with every substring of string1 that was captured by a group. Since there is only 1 group, you just access the 0th element.
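
One thing the breakdown leaves implicit is that re.match returns None when the pattern does not match, so calling .groups() directly raises an AttributeError for a string with no trailing "(digits)". A hedged sketch of a guard, where base_name and the "noparens" sample are hypothetical names chosen for illustration:

```python
import re

# Same pattern as in the answer: capture everything before a "(digits)" group.
PATTERN = re.compile(r"(.+)\(\d+\)")

def base_name(s):
    """Return the captured prefix, or s unchanged if the pattern does not match."""
    m = PATTERN.match(s)
    # re.match returns None on failure, so check before calling .groups().
    return m.groups()[0] if m else s

for s in ("thisisnumber1(111)", "itsraining(22252)", "fluffydog(3)", "noparens"):
    print(base_name(s))
```

Returning the input unchanged on a failed match is just one possible choice; raising an error may be more appropriate if every string is expected to carry the suffix.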

There's a nice tutorial on regular expressions on Tutorials Point if you want to learn more.

Answer 3

Since you say in a comment:

"all that will be in the parentheses will be numbers"

so you'll always have digits between your parens, I'd recommend taking a look at removing them with the regular expression module:

import re

string1 = "thisisnumber1(111)"
string2 = "itsraining(22252)"
string3 = "fluffydog(3)"

strings = string1, string2, string3

for s in strings:
    s_replaced = re.sub(
        r'''
        \( # must escape the parens, since these are special characters in regex
        \d+ # one or more digits, 0-9
        \)
        ''', # this regular expression will be replaced by the next argument
        '', # replace the above with an empty string
        s, # the string we're modifying
        flags=re.VERBOSE) # must be passed by keyword: the 4th positional argument of re.sub is count
    print(s_replaced)

prints:

thisisnumber1
itsraining
fluffydog
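
The pattern above removes every "(digits)" group wherever it appears. If only the one at the very end should go, a $ anchor can be added. This is a sketch under that assumption, not part of the original answer; strip_trailing_number and the "room(12)view(7)" sample are made up for illustration:

```python
import re

def strip_trailing_number(s):
    """Remove a "(digits)" group only when it ends the string (hypothetical helper)."""
    # \(\d+\) is the same pattern as above; $ anchors it to the end of the string.
    return re.sub(r"\(\d+\)$", "", s)

print(strip_trailing_number("fluffydog(3)"))     # fluffydog
print(strip_trailing_number("room(12)view(7)"))  # room(12)view
print(strip_trailing_number("fluffydog"))        # fluffydog
```

Without the anchor, the second sample would lose both groups and come out as "roomview".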

3 answers

solution1
4 2014-10-14 00:41:20

solution2
1 2014-10-14 00:59:33

solution3
0 2014-10-14 00:56:53
