Python: Matching nested dict/lists with a list of strings

I have a list of titles as strings. I want to identify any of these titles in which every keyword from one of a few specific keyword lists is found (e.g. 'new excellent iphone 4 8gb' would match ['4', '8gb']). All of the keywords within a keyword set must be in the title string to count as a match (i.e. 'iphone 4' would not match ['4', '8gb']), and they should be separate words, i.e. I don't want ['4', '8gb'] to match 'iphone 4s 8gb'. I have these keyword sets in dicts nested inside a list.

My code is below, although it is missing a key part: the loop through each of the lists of keywords, which I'm having trouble wrapping my head around. What is the most efficient way, performance-wise, of writing this?

cleantitles = ['title1','title2','title3']
models = [{'model': ['4', '8gb'], 'mapped': u'iphone 4 8gb'}, {'model': ['4', '16gb'], 'mapped': u'iphone 4 16gb'}]

for title in cleantitles:
    if all(x in title for x in ???):
        print 'matched something!'
    else:
        print 'no match:('  

Trying to fit this sort of statement in one line is likely to make it unreadable. You need 3 layers of iteration: title, model, keywords. You're currently trying to combine model and keywords into one statement. I would recommend you avoid this.

You're also missing the step of pulling the keyword list out of each dictionary using [key] (here, model['model']).

You'll want something like this:

for title in cleantitles:
    for model in models:
        if all(x in title for x in model['model']):
            print('matched something!')
        else:
            print('no match:(')
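
Note that `x in title` is a substring test, so '4' would also be found inside 'iphone 4s 8gb', which the question wants to exclude. A minimal sketch of a whole-word variant follows, assuming titles are already cleaned and whitespace-separated. The function name `find_matches` and the sample titles are made up for illustration and are not part of the original answer.

```python
def find_matches(titles, models):
    """Return (title, mapped) pairs where every keyword is a whole word of the title."""
    matches = []
    for title in titles:
        words = title.split()  # whole words, so '4' does not match '4s'
        for model in models:
            if all(keyword in words for keyword in model['model']):
                matches.append((title, model['mapped']))
    return matches


# Models are taken from the question; the titles are invented for this sketch.
models = [
    {'model': ['4', '8gb'], 'mapped': 'iphone 4 8gb'},
    {'model': ['4', '16gb'], 'mapped': 'iphone 4 16gb'},
]
titles = ['new excellent iphone 4 8gb', 'iphone 4', 'iphone 4s 8gb']

for title, mapped in find_matches(titles, models):
    print(title, '->', mapped)
```

Collecting the matches in a list, rather than printing inside the loop, also avoids reporting 'no match' once per model for a title that does match a different model.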

Never prematurely optimise your code. Write the code in the simplest way you can, then if this isn't fast enough for your situation, refactor.
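
If measurement later shows the loop really is a bottleneck, one possible refactor is to turn each keyword list into a frozenset once, then test it against the set of words in each title. This is only a sketch, assuming titles can be split on whitespace; `build_index` and `match_title` are hypothetical names, not something from the original answer.

```python
def build_index(models):
    """Precompute (keyword_set, mapped) pairs once, outside the title loop."""
    return [(frozenset(m['model']), m['mapped']) for m in models]


def match_title(title, index):
    """Return the mapped names whose keywords are all whole words of title."""
    words = set(title.split())
    # keywords <= words is the subset test: every keyword appears in words
    return [mapped for keywords, mapped in index if keywords <= words]


# Models are taken from the question; the titles are invented for this sketch.
models = [
    {'model': ['4', '8gb'], 'mapped': 'iphone 4 8gb'},
    {'model': ['4', '16gb'], 'mapped': 'iphone 4 16gb'},
]
index = build_index(models)
print(match_title('new excellent iphone 4 16gb', index))
print(match_title('iphone 4s 8gb', index))
```

Whether this is actually faster depends on the data, so timing both versions on real titles (for example with the standard `timeit` module) would be the way to decide.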
