Loop to find decrementing whole word string in a list

I have a string variable that is a person's first and middle names, which have been accidentally concatenated. Let's call it firstMiddle="johnadam"

I need to work out which part is the first name and then split the two into separate variables. I have a big text file full of first names, and the idea is to check whether the full firstMiddle string is in the list; if it isn't, drop the last character and retry. (If you grow the string from the front instead, you fail, e.g. you get "max" from "maxinea".)

I have tried writing this a hundred different ways, and my problem seems to be that I can't get it to match a whole word (does this \b regex stuff only work on actual strings and not string variables?). The best outcome I had decremented "johnadam" down to "johna" because there is a name "johnathan" in the list. Now I can't even remember how I did that, and my current code decrements just once and then quits even though nameToCheck in nameList == False.

I'm a total noob. I know I'm doing something very obviously stupid. Please help. Here's some code:

firstMiddle = "johnadam"
nameToCheck = firstMiddle

for match in nameList:
    if nameToCheck not in nameList:
        nameToCheck = nameToCheck[:-1]
        break

firstName = nameToCheck
middleName = firstMiddle.partition(nameToCheck)[2]

firstMiddle = "johnadam"
nameToCheck = firstMiddle
nameList = ['johnathan', 'john', 'kate', 'sam']

while nameToCheck and nameToCheck not in nameList:
    nameToCheck = nameToCheck[:-1]  # drop the last character and retry
firstName = nameToCheck  # empty string if no name in the list matched
middleName = firstMiddle[len(firstName):]
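
As a minimal sketch, the same loop can be wrapped in a function. The name `split_first_middle` and the `raw_text` sample are assumptions made for illustration, not part of the question. One possible cause of the "johna" result is that the names were held as one big string (the raw file contents) rather than a list: `in` on a string is a substring test, so "johna" is found inside "johnathan". Splitting the text into a list or set first makes `in` compare whole names.

```python
def split_first_middle(first_middle, known_names):
    """Return (first, middle), trying the longest prefix first."""
    names = set(known_names)  # membership in a set compares whole strings
    for end in range(len(first_middle), 0, -1):
        candidate = first_middle[:end]
        if candidate in names:
            return candidate, first_middle[end:]
    return None, first_middle  # no known first name found


# Hypothetical file contents, one name per line.
raw_text = "johnathan\njohn\nkate\nsam"
name_list = raw_text.split()  # ['johnathan', 'john', 'kate', 'sam']

print("johna" in raw_text)   # True: substring test on a string
print("johna" in name_list)  # False: whole-item test on a list
print(split_first_middle("johnadam", name_list))  # ('john', 'adam')
```

Returning None for the first name when nothing matches is just one choice; it avoids the endless loop that an empty nameToCheck would otherwise cause.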

This is a small change to what Gabriel has done; the concept is basically the same. This one picks the longest match rather than the first match. It's difficult to put the entire code in the comment section, so I'm answering separately.

firstMiddle = "johnadam"
nameToCheck = firstMiddle
nameList = ['johnathan', 'john', 'kate', 'sam']

# keep only the known names that firstMiddle starts with
firstNames = [m for m in nameList if firstMiddle.startswith(m)]
middleName = ''
if firstNames:  # if the list isn't empty
    firstName = max(firstNames, key=len)  # longest matching prefix
    middleName = firstMiddle.partition(firstName)[2]
else:
    firstName = firstMiddle

print(firstName)

See if this works ...

You could do this in a brute force way.

Iterate over the mixed name and move the split point by one character each time.

So you get

['johnadam']
['johnada','m']
['johnad','am']
['johna','dam']

['john','adam'] # your names in the list

If one of the pieces matches, put it aside and keep going until every mixed name has been checked.

If you have names that start the same way, like 'john' and 'johnathan', or that sit inside other names, like 'nathan' in 'johnathan', you should not stop at the first match but keep slicing, so you also get ['john','athan'] and ['joh','nathan'].

mixnames = ['johnadam','johnathan','johnamax']
names = ['john','adam', 'johnathan','nathan','max']

foundnames = []
for name in mixnames:
    for i in range(len(name)):
        name1 = name[0:i+1]  # prefix ending at index i
        name2 = name[i:]     # suffix starting at index i
        if name1 in names and name1 not in foundnames:
            foundnames.append(name1)
        if name2 in names and name2 not in foundnames:
            foundnames.append(name2)

print(foundnames)

output:

['john', 'adam', 'johnathan', 'nathan', 'max']
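
The loop above collects every known name that appears as a prefix or suffix, but it does not pair them up. If what you want is the split itself, a rough sketch (assuming both halves must be in the list; `candidate_splits` and the sample names are made up for illustration) could return every valid pair, so that ambiguous cases stay visible instead of being silently resolved:

```python
def candidate_splits(mixed, known_names):
    """Return every (first, middle) pair where both halves are known names."""
    names = set(known_names)
    return [
        (mixed[:i], mixed[i:])
        for i in range(1, len(mixed))  # split points strictly inside the string
        if mixed[:i] in names and mixed[i:] in names
    ]


print(candidate_splits("johnadam", ["john", "adam", "johnathan", "nathan", "max"]))
# [('john', 'adam')]
print(candidate_splits("annabel", ["ann", "anna", "abel", "bel"]))
# [('ann', 'abel'), ('anna', 'bel')]
```

When more than one pair comes back, the list alone cannot tell you which one is right, so you would need some other rule (for example, preferring the longest first name).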
