What is the best way to get rid of common substring prefix in Python3?

Question

Let's assume we have a string and a list of strings:

String:

str1 = <common-part>

List of strings:

[<common-part>-<random-text-a>, <common-part>-<random-text-b>]

What is the best way (in terms of readability and code purity) to get a list like this:

[<random-text-a>, <random-text-b>]

Answer 1

I would compute the common prefix of all strings using os.path.commonprefix, then slice the strings to remove that prefix. (The function lives in the os.path module, but it compares character by character and does not check path separators, so it is usable in a generic context.)

import os

p = ["<common-part>-<some-text-a>", "<common-part>-<random-text-b>"]
commonprefix = os.path.commonprefix(p)

new_p = [x[len(commonprefix):] for x in p]

print(new_p)

result (since commonprefix is "<common-part>-<"):

['some-text-a>', 'random-text-b>']
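As the result above shows, the character-wise comparison can run past the separator and swallow the leading "<" of the varying part. A minimal sketch of one way to avoid that, assuming the separator is known (here "-"); the helper name strip_common_prefix and its sep parameter are illustrative, not part of os.path:

```python
import os

def strip_common_prefix(strings, sep="-"):
    """Remove the shared prefix, but cut only just after the last `sep` in it."""
    prefix = os.path.commonprefix(strings)
    idx = prefix.rfind(sep)
    # If the separator never appears in the common prefix, remove nothing.
    cut = idx + len(sep) if idx != -1 else 0
    return [s[cut:] for s in strings]

items = ["<common-part>-<some-text-a>", "<common-part>-<random-text-b>"]
print(strip_common_prefix(items))  # ['<some-text-a>', '<random-text-b>']
```

Trimming back to the separator keeps each remaining item intact, at the cost of having to know the separator in advance.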

notes:

This method handles a fully dynamic prefix that is not known in advance. By reversing the strings, it is also possible to remove a common suffix.
It is better to slice with len than to use str.replace(): slicing is faster, it only removes the start of the string, and it is safe because we know that every string starts with this prefix.
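The suffix variant mentioned in the notes could look roughly like this. It is a sketch only; the helper name strip_common_suffix and the sample file names are made up for illustration:

```python
import os

def strip_common_suffix(strings):
    """Remove the longest suffix shared by all strings."""
    # The common suffix is the common prefix of the reversed strings.
    reversed_strings = [s[::-1] for s in strings]
    suffix_len = len(os.path.commonprefix(reversed_strings))
    # Slice with an explicit end index: s[:-0] would return an empty string.
    return [s[:len(s) - suffix_len] for s in strings]

names = ["report-2018.txt", "summary-2018.txt"]
print(strip_common_suffix(names))  # ['report', 'summary']
```

Only the length of the common suffix is needed, so the reversed strings are never reversed back.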

Answer 2

You can use list comprehensions, which are pretty pythonic:

[newstr.replace(str1, '', 1) for newstr in list_of_strings]

newstr.replace(str1, '', 1) will only replace the first occurrence of str1. Thanks to @ev-kounis for suggesting it.
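One caveat: the first occurrence is not necessarily at the start of the string. A small sketch of the difference, using a hypothetical helper strip_prefix and made-up sample data; on Python 3.9 and later the built-in str.removeprefix behaves like this helper:

```python
def strip_prefix(text, prefix):
    """Drop `prefix` only when `text` actually starts with it."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

str1 = "ab-"
list_of_strings = ["ab-one", "x-ab-two"]

# replace() removes the first occurrence wherever it is found.
print([s.replace(str1, "", 1) for s in list_of_strings])  # ['one', 'x-two']
# The guarded version leaves strings without the prefix untouched.
print([strip_prefix(s, str1) for s in list_of_strings])   # ['one', 'x-ab-two']
```

If every string is known to start with str1, as in the question, both approaches give the same result.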

Answer 3

prefix = "xxx-"
MyList = ["xxx-56", "xxx-57", "xxx-58"]
MyList = [x[len(prefix):] for x in MyList] # for each x in the list,
                                 # the comprehension keeps x[len(prefix):],
                                 # i.e. the string x without its first len(prefix) characters

print(MyList)

---> ['56', '57', '58']

Answer 4

I would have done...

common = "Hello_"
lines = ["Hello_1 !", "Hello_2 !", "Hello_3 !"]

new_lines = []
for line in lines:
    # Finding first occurrence of the word we want to remove.
    startIndex = line.find(common) + len(common)
    new_lines.append(line[startIndex:])

print(new_lines)

Comparing performance against Jean-François Fabre's approach, since we're at it:

from timeit import timeit
import os

def test_fabre(lines):
    # import os

    commonprefix = os.path.commonprefix(lines)
    return [x[len(commonprefix):] for x in lines]

def test_insert(common, lines):
    new_lines = []
    for line in lines:
        startIndex = line.find(common) + len(common)
        new_lines.append(line[startIndex:])
    return new_lines

print(timeit("test_insert(common, lines)", 'from __main__ import test_insert; common="Hello_";lines = ["Hello_1 !", "Hello_2 !", "Hello_3 !"]'))
print(timeit("test_fabre(lines)", 'from __main__ import test_fabre; lines = ["Hello_1 !", "Hello_2 !", "Hello_3 !"]'))

# test_insert outputs : 2.92963575145
# test_fabre outputs : 4.23027790484 (with import os OUTside func)
# test_fabre outputs : 5.86552750264 (with import os INside func)
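For anyone wanting to rerun the comparison on Python 3, a sketch along these lines might do. The function names are illustrative, the iteration count is arbitrary, and no timings are claimed here. Note that the two approaches are not strictly equivalent: one is handed the prefix, the other has to compute it.

```python
import os
from timeit import timeit

def strip_with_commonprefix(lines):
    # Computes the prefix first, then slices it off every string.
    prefix = os.path.commonprefix(lines)
    return [x[len(prefix):] for x in lines]

def strip_with_find(common, lines):
    # Assumes `common` occurs in every line; find() returns -1 otherwise.
    return [line[line.find(common) + len(common):] for line in lines]

common = "Hello_"
lines = ["Hello_1 !", "Hello_2 !", "Hello_3 !"]

# Both approaches should agree on this input before timing them.
assert strip_with_find(common, lines) == strip_with_commonprefix(lines)

for stmt in ("strip_with_find(common, lines)", "strip_with_commonprefix(lines)"):
    seconds = timeit(stmt, globals=globals(), number=100_000)
    print(f"{stmt}: {seconds:.3f} s")
```

Passing globals=globals() (available since Python 3.5) replaces the 'from __main__ import ...' setup string.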

Answer 5

str1 = "hello"
list1 = ["hello1", "hello2", "hello3"]
list2 = []
for i in list1:
    list2.append(i.replace(str1,""))
print(list2)

This is the easiest way to do it.

5 answers

solution1
5 ACCEPTED 2018-01-17 14:46:41

solution2
2 2018-01-17 14:43:26

solution3
2 2018-01-17 14:45:20

solution4
2 2018-01-17 15:01:25

solution5
0 2018-01-17 14:46:26
