Loop to find decrementing whole word string in a list

I have a string variable holding a person's first and middle names, which have been accidentally concatenated. Let's call it firstMiddle="johnadam".

I need to identify which part is the first name and which isn't, and then split them into different variables. So I have this big text file full of first names, and the idea is that you check the full firstMiddle string to see if it's in the list, and if it isn't, you drop the last character and retry. (If you increment instead, you fail, e.g. you get "max" from "maxinea".)

I have tried writing this a hundred different ways, and my problem seems to be that I can't get x in y to match a whole word (this \b regex stuff only works on actual strings and not string variables?). The best outcome I had was decrementing "johnadam" down to "johna", because there is a name "johnathan" in the list. Now I can't even remember how I did that, and my current code decrements just once and then quits even though nameToMatch in nameList == False.
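On the whole-word concern, here is a minimal sketch. It assumes the names sit in a Python list called name_list, a hypothetical stand-in for the names file. The in operator on a list compares whole elements, so the lookup itself needs no regex. It is in on a single string that does substring matching. \b also works with a variable, provided the pattern is built from the variable's value, here via re.escape.

```python
import re

name_list = ["johnathan", "john", "kate", "sam"]  # hypothetical sample data

# `in` on a list compares whole elements, so a partial name never matches.
print("johna" in name_list)      # False
print("john" in name_list)       # True

# `in` on a single string is a substring test, which is where partial hits come from.
print("johna" in "johnathan")    # True

# \b works with a variable too: build the pattern from the variable's value.
candidate = "john"
pattern = re.compile(r"\b" + re.escape(candidate) + r"\b")
print(bool(pattern.search("kate john sam")))   # True
print(bool(pattern.search("johnathan")))       # False
```

If a partial name such as "johna" is reported as found, the likely cause is that the names were read into one big string rather than a list of separate entries.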

I'm a total noob. I know I'm doing something very obviously stupid. Please help. Here's some code:

firstMiddle = "johnadam"
nameToCheck = firstMiddle

for match in nameList:
    if nameToCheck not in nameList:
        nameToCheck = nameToCheck[:-1]
        break

firstName = nameToCheck
middleName = firstMiddle.partition(nameToCheck)[2]
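
firstMiddle = "johnadam"
nameToCheck = firstMiddle
nameList = ['johnathan', 'john', 'kate', 'sam']

# Drop the last character until what is left is a whole entry in nameList.
# The "nameToCheck and" guard ends the loop if no prefix matches at all.
while nameToCheck and nameToCheck not in nameList:
    nameToCheck = nameToCheck[:-1]
firstName = nameToCheck
middleName = firstMiddle[len(firstName):]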
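
The decrementing loop can also be wrapped in a function, sketched below under some assumptions. The function name split_first_middle, the sample names, and the choice to return the whole string as the first name when nothing matches are illustrative and not part of the original answer.

```python
def split_first_middle(first_middle, names):
    """Return (first, middle), using the longest prefix found in names."""
    known = set(names)  # set membership is a whole-element test and is fast
    candidate = first_middle
    # Drop one trailing character at a time until the prefix is a known name.
    while candidate and candidate not in known:
        candidate = candidate[:-1]
    if not candidate:
        # No prefix matched: treat the whole string as the first name (an assumption).
        return first_middle, ""
    return candidate, first_middle[len(candidate):]


sample_names = ["johnathan", "john", "kate", "sam", "max", "maxine"]
print(split_first_middle("johnadam", sample_names))  # ('john', 'adam')
print(split_first_middle("maxinea", sample_names))   # ('maxine', 'a')
print(split_first_middle("zed", sample_names))       # ('zed', '')
```

Because the search starts from the full string and shrinks, "maxinea" resolves to "maxine" rather than stopping early at "max".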

This is a simple change from what Gabriel has done. The concept is basically the same. This one just looks for the longest match rather than the first match. It's difficult to put the entire code in the comment section, so I'm answering separately.

firstMiddle = "johnadam"
nameList = ['johnathan', 'john', 'kate', 'sam']

# Every known name that firstMiddle starts with (a list, so the emptiness test works in Python 3).
firstNames = [m for m in nameList if firstMiddle.startswith(m)]
middleName = ''
if firstNames:  # if the list isn't empty
    firstName = max(firstNames, key=len)  # longest matching name
    middleName = firstMiddle.partition(firstName)[2]
else:
    firstName = firstMiddle

print(firstName)

See if this works...

You could do this in a brute-force way.

Iterate over the mixed name and slice off a bit more each time.

So you get:

['johnadam']
['johnada','m']
['johnad','am']
['johna','dam']

['john','adam'] # your names in the list

If one of them matches, you put it aside and keep going until all of them are matched.

If you have names that start the same, like 'john' and 'johnathan', or that sit in the middle of other names, like 'nathan' and 'johnathan', you should not stop when you find a match but keep slicing, so that you get ['john','athan'] and ['joh','nathan'].

mixnames = ['johnadam', 'johnathan', 'johnamax']
names = ['john', 'adam', 'johnathan', 'nathan', 'max']

foundnames = []
for name in mixnames:
    # Cut at every position, including 0 and len(name), so the whole string is tried too.
    for i in range(len(name) + 1):
        name1 = name[:i]   # left part of the cut
        name2 = name[i:]   # right part of the cut
        if name1 in names and name1 not in foundnames:
            foundnames.append(name1)
        if name2 in names and name2 not in foundnames:
            foundnames.append(name2)

print(foundnames)

output:

['john', 'adam', 'johnathan', 'nathan', 'max']
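
A variation on the brute-force idea, offered as a sketch only: instead of collecting names into one flat list, keep each cut as a (left, right) pair so that ambiguous inputs show all their candidate splits. The function name candidate_splits is hypothetical and not from the original answer.

```python
def candidate_splits(mixed, names):
    """List every (left, right) cut of mixed where at least one half is a known name."""
    known = set(names)
    splits = []
    for i in range(1, len(mixed) + 1):
        left, right = mixed[:i], mixed[i:]
        if left in known or right in known:
            splits.append((left, right))
    return splits


names = ["john", "adam", "johnathan", "nathan", "max"]
print(candidate_splits("johnadam", names))   # [('john', 'adam')]
print(candidate_splits("johnathan", names))
# [('joh', 'nathan'), ('john', 'athan'), ('johnathan', '')]
```

Keeping the pairs leaves the decision about which split is the right one to a later step, for example preferring cuts where both halves are known names.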
