Say I have two strings:
string1 = 'Hello how are you'
string2 = 'are you doing now?'
The result should be something like
Hello how are you doing now?
I was thinking of different ways to do this using re and string search (the longest common substring problem).
But is there a simple way (or a library) that does this in Python?
To make things clear, I'll add one more set of test strings:
string1 = 'This is a nice ACADEMY'
string2 = 'DEMY you know!'
The result would be:
'This is a nice ACADEMY you know!'
This should do:
string1 = 'Hello how are you'
string2 = 'are you doing now?'
i = 0
while not string2.startswith(string1[i:]):
    i += 1
sFinal = string1[:i] + string2
Output:
>>> sFinal
'Hello how are you doing now?'
Or make it a function so that you can reuse it:
def merge(s1, s2):
    i = 0
    # Advance i until the remaining tail of s1 is a prefix of s2.
    while not s2.startswith(s1[i:]):
        i += 1
    return s1[:i] + s2
Output:
>>> merge('Hello how are you', 'are you doing now?')
'Hello how are you doing now?'
>>> merge("This is a nice ACADEMY", "DEMY you know!")
'This is a nice ACADEMY you know!'
This should do what you want:
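def overlap_concat(s1, s2):
    l = min(len(s1), len(s2))
    # Try the longest possible overlap first and shrink until one matches.
    for i in range(l, 0, -1):
        if s1.endswith(s2[:i]):
            return s1 + s2[i:]
    # No overlap at all: plain concatenation.
    return s1 + s2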
Examples:
>>> overlap_concat("Hello how are you", "are you doing now?")
'Hello how are you doing now?'
>>>
>>> overlap_concat("This is a nice ACADEMY", "DEMY you know!")
'This is a nice ACADEMY you know!'
>>>
Using str.endswith and enumerate:
def overlap(string1, string2):
    # Stop at the shortest prefix of string2 that string1 ends with.
    for i, s in enumerate(string2, 1):
        if string1.endswith(string2[:i]):
            break
    return string1 + string2[i:]
>>> overlap("Hello how are you", "are you doing now?")
'Hello how are you doing now?'
>>> overlap("This is a nice ACADEMY", "DEMY you know!")
'This is a nice ACADEMY you know!'
If you need to account for trailing special characters, you could use an re-based substitution:
import re
string1 = re.sub(r'[^\w\s]', '', string1)
Note, however, that this removes all special characters in the first string, not just the trailing ones.
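If only the trailing punctuation gets in the way, one option is to anchor the pattern at the end of the string. This is a minimal sketch under that assumption; strip_trailing_special is a hypothetical helper name, not part of the answer above:

```python
import re

def strip_trailing_special(text):
    # Remove only the run of non-word, non-space characters at the very end;
    # punctuation elsewhere in the string is left untouched.
    return re.sub(r'[^\w\s]+$', '', text)

print(strip_trailing_special('Hello, how are you?!'))
```

The cleaned string can then be passed as the first argument to overlap.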
A modification of the above function that finds the longest matching overlap (instead of the shortest) checks the prefixes of string2 from longest to shortest:
def overlap(string1, string2):
    # i counts how many characters are trimmed from the end of string2.
    for i in range(len(string2)):
        if string1.endswith(string2[:len(string2) - i]):
            break
    return string1 + string2[len(string2) - i:]
>>> overlap('Where did', 'did you go?')
'Where did you go?'
The other answers are great, but they fail for this input:
string1 = 'THE ACADEMY has'
string2 = '.CADEMY has taken'
output:
>>> merge(string1,string2)
'THE ACADEMY has.CADEMY has taken'
>>> overlap(string1,string2)
'THE ACADEMY has'
However, the standard library module difflib proved to be effective in my case:
from difflib import SequenceMatcher
match = SequenceMatcher(None, string1, string2).find_longest_match(
    0, len(string1), 0, len(string2))
print(match)  # -> Match(a=5, b=1, size=10)
print(string1[:match.a + match.size] + string2[match.b + match.size:])
output:
Match(a=5, b=1, size=10)
THE ACADEMY has taken
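The same idea can be wrapped in a reusable function. The following is only a sketch: merge_on_longest_match is a hypothetical name, and both the fallback to plain concatenation when nothing matches and the autojunk=False setting are my own assumptions rather than part of the answer.

```python
from difflib import SequenceMatcher

def merge_on_longest_match(s1, s2):
    """Join s1 and s2 around their longest common block (hypothetical helper)."""
    matcher = SequenceMatcher(None, s1, s2, autojunk=False)
    match = matcher.find_longest_match(0, len(s1), 0, len(s2))
    if match.size == 0:
        # Assumption: with nothing in common, fall back to plain concatenation.
        return s1 + s2
    # Keep s1 up to the end of the match, then s2 from the end of the match.
    return s1[:match.a + match.size] + s2[match.b + match.size:]

print(merge_on_longest_match('THE ACADEMY has', '.CADEMY has taken'))
```

Keep in mind that the longest common block is not necessarily at the end of s1 or the start of s2, so anything after the match in s1 and anything before it in s2 is dropped.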
The words you want to drop appear in the second string, so you can try something like:
new_string = [string2.split()]
new1 = [j for item in new_string for j in item if j not in string1]
new1.insert(0, string1)
print(" ".join(new1))
With the first test case:
string1 = 'Hello how are you'
string2 = 'are you doing now?'
output:
Hello how are you doing now?
With the second test case:
string1 = 'This is a nice ACADEMY'
string2 = 'DEMY you know!'
output:
This is a nice ACADEMY you know!
Explanation:
First, we split the second string so we can find which words to drop:
new_string = [string2.split()]
Second, we check each word of the split string against string1 (a substring test). Any word already contained in string1 is skipped, and the remaining words are kept:
new1 = [j for item in new_string for j in item if j not in string1]
This list comprehension is the same as:
new1 = []
for item in new_string:
    for j in item:
        if j not in string1:
            new1.append(j)
The last step puts string1 at the front of the list and joins everything with spaces:
new1.insert(0, string1)
print(" ".join(new1))