Simplifying many if-statements

Is there a way to simplify this pile of if-statements? This parsing function works (given the right dictionaries), but it has to test 6 if-statements for each word in the input. For a 5-word sentence that is 30 if-statements. It is also rather hard to read.

def parse(text):
    predicate=False
    directObjectAdjective=False
    directObject=False
    preposition=False
    indirectObjectAdjective=False
    indirectObject=False
    text=text.casefold()
    text=text.split()
    for word in text:
        if not predicate:
            if word in predicateDict:
                predicate=predicateDict[word]
                continue
        if not directObjectAdjective:
            if word in adjectiveDict:
                directObjectAdjective=adjectiveDict[word]
                continue
        if not directObject:
            if word in objectDict:
                directObject=objectDict[word]
                continue
        if not preposition:
            if word in prepositionDict:
                preposition=prepositionDict[word]
                continue
        if not indirectObjectAdjective:
            if word in adjectiveDict:
                indirectObjectAdjective=adjectiveDict[word]
                continue
        if not indirectObject:
            if word in objectDict:
                indirectObject=objectDict[word]
                continue
    if not directObject and directObjectAdjective:
        directObject=directObjectAdjective
        directObjectAdjective=False
    if not indirectObject and indirectObjectAdjective:
        indirectObject=indirectObjectAdjective
        indirectObjectAdjective=False
    return [predicate,directObjectAdjective,directObject,preposition,indirectObjectAdjective,indirectObject]

Here is a sample of one of the dictionaries, in case that's needed.

predicateDict={
"grab":"take",
"pick":"take",
"collect":"take",
"acquire":"take",
"snag":"take",
"gather":"take",
"attain":"take",
"capture":"take",
"take":"take"}

This is more of a Code Review question than a Stack Overflow one. A major issue is that you keep similar data in separate variables. If you combine your variables, you can iterate over them.

missing_parts_of_speech = ["predicate", [...]]
dict_look_up = {"predicate": predicateDict,
                [...]
               }
found_parts_of_speech = {}
for word in text:
    for part in missing_parts_of_speech:
        if word in dict_look_up[part]:
            found_parts_of_speech[part] = dict_look_up[part][word]
            missing_parts_of_speech.remove(part)
            # Stop at the first match so one word fills only one slot;
            # leaving the loop right after remove() also keeps the iteration safe.
            break
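The outline above can be filled in as a complete function. This is only a minimal sketch, not a tested drop-in replacement: the tiny dictionaries and the name SLOTS are made up for illustration, and only predicateDict resembles the sample given in the question.

```python
# Hypothetical sample dictionaries; the real ones are assumed to be much larger.
predicateDict = {"grab": "take", "take": "take"}
adjectiveDict = {"red": "red", "small": "small"}
objectDict = {"ball": "ball", "box": "box"}
prepositionDict = {"in": "in", "into": "in"}

# Slot name -> lookup dictionary, in the order the original if-chain tests them.
SLOTS = [
    ("predicate", predicateDict),
    ("directObjectAdjective", adjectiveDict),
    ("directObject", objectDict),
    ("preposition", prepositionDict),
    ("indirectObjectAdjective", adjectiveDict),
    ("indirectObject", objectDict),
]


def parse(text):
    found = {name: False for name, _ in SLOTS}
    for word in text.casefold().split():
        for name, lookup in SLOTS:
            if not found[name] and word in lookup:
                found[name] = lookup[word]
                break  # one word fills at most one slot
    # An adjective without a following object is promoted to the object slot.
    for obj, adj in (("directObject", "directObjectAdjective"),
                     ("indirectObject", "indirectObjectAdjective")):
        if not found[obj] and found[adj]:
            found[obj], found[adj] = found[adj], False
    return [found[name] for name, _ in SLOTS]


print(parse("Grab the red ball into the small box"))
# ['take', 'red', 'ball', 'in', 'small', 'box']
```

Keeping the slots in an ordered list makes the test order explicit, so adding a seventh part of speech means adding one line to SLOTS rather than another if-block.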

I would suggest simply using the method dict.get. It takes an optional argument default. If the key is not present in the dictionary, the default value is returned instead of a KeyError being raised.

If you pass the previously assigned variable as the default, a miss leaves the variable unchanged instead of overwriting it with an arbitrary value. E.g., if the current word is a "predicate", the "direct object" is simply reassigned the value that was already stored in the variable.
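To make the "previous value as default" idea concrete, here is a small illustration. The dictionary colours and the variable names are invented for this sketch and do not appear in the question.

```python
colours = {"red": "#ff0000"}  # hypothetical lookup table

current = "unset"
current = colours.get("blue", current)   # miss: current stays "unset"
after_miss = current
current = colours.get("red", current)    # hit: current becomes "#ff0000"
after_hit = current
current = colours.get("green", current)  # miss again: the earlier hit is kept

print(after_miss, after_hit, current)  # unset #ff0000 #ff0000
print(colours.get("blue"))             # None, because default is None when omitted
```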


CODE

def parse(text):
    predicate = False
    directObjectAdjective = False
    directObject = False
    preposition = False
    indirectObjectAdjective = False
    indirectObject = False

    text=text.casefold()
    text=text.split()
    for word in text:
        predicate = predicateDict.get(word, predicate)
        directObjectAdjective = adjectiveDict.get(word, directObjectAdjective)
        directObject = objectDict.get(word, directObject)
        preposition = prepositionDict.get(word, preposition)
        indirectObjectAdjective = adjectiveDict.get(word, indirectObjectAdjective)
        indirectObject = objectDict.get(word, indirectObject)

    if not directObject and directObjectAdjective:
        directObject = directObjectAdjective
        directObjectAdjective = False

    if not indirectObject and indirectObjectAdjective:
        indirectObject = indirectObjectAdjective
        indirectObjectAdjective = False

    return [predicate, directObjectAdjective, directObject, preposition, indirectObjectAdjective, indirectObject]

PS: Use a few more spaces. Readers will thank you...


PPS: I have not tested this, for I do not have such dictionaries at hand.


PPPS: This will always return the last occurrences of the types within the text, while your implementation will always return the first occurrences.
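The difference is easy to see with a toy input. The sketch below uses a made-up two-entry adjectiveDict and two hypothetical helper functions, first_wins and last_wins, that reduce each approach to the two adjective slots. It also suggests a second consequence of dropping the if-chain: since both adjective slots read from the same dictionary, the dict.get version assigns the same word to both.

```python
adjectiveDict = {"red": "red", "small": "small"}  # hypothetical sample data


def first_wins(words):
    # Mirrors the if-chain from the question: a filled slot is never overwritten,
    # and a word that fills one slot is not tested against the remaining ones.
    direct = indirect = False
    for word in words:
        if not direct and word in adjectiveDict:
            direct = adjectiveDict[word]
            continue
        if not indirect and word in adjectiveDict:
            indirect = adjectiveDict[word]
            continue
    return direct, indirect


def last_wins(words):
    # Mirrors the dict.get version: every hit overwrites the slot.
    direct = indirect = False
    for word in words:
        direct = adjectiveDict.get(word, direct)
        indirect = adjectiveDict.get(word, indirect)
    return direct, indirect


words = ["red", "small"]
print(first_wins(words))  # ('red', 'small')
print(last_wins(words))   # ('small', 'small')
```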

You could map the different kinds of words (as strings) to the dictionaries in which to look those words up, then check which kinds have not been found yet and look the word up in their dicts.

needed = {"predicate": predicateDict,
          "directObjectAdjective": adjectiveDict,
          "directObject": objectDict,
          "preposition": prepositionDict,
          "indirectObjectAdjective": adjectiveDict,
          "indirectObject": objectDict}

for word in text:
    for kind in needed:
        if isinstance(needed[kind], dict) and word in needed[kind]:
            needed[kind] = needed[kind][word]
            # stop here so the same word is not also used for a later kind
            break

In the end (and at each step along the way), every item in needed whose value is not a dict has been found and replaced by the value from its respective dict.

(In retrospect, it might make more sense to use two dictionaries, or one dict and a set: one for the final value for each kind of word, and one for whether it has already been found. That would probably be a bit easier to grasp.)
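A rough sketch of that dict-plus-set variant might look like the following. The sample dictionaries, the input sentence, and the names lookups, found and missing are assumptions made for illustration only.

```python
# Hypothetical sample dictionaries, for illustration only.
predicateDict = {"grab": "take", "take": "take"}
adjectiveDict = {"red": "red", "small": "small"}
objectDict = {"ball": "ball", "box": "box"}
prepositionDict = {"into": "in", "in": "in"}

lookups = {"predicate": predicateDict,
           "directObjectAdjective": adjectiveDict,
           "directObject": objectDict,
           "preposition": prepositionDict,
           "indirectObjectAdjective": adjectiveDict,
           "indirectObject": objectDict}

found = {}               # kind -> final value
missing = set(lookups)   # kinds that have not been found yet

for word in "grab red ball into small box".split():
    # Iterate over lookups (insertion-ordered) rather than the unordered set,
    # so the kinds are tested in the same order as in the original if-chain.
    for kind, lookup in lookups.items():
        if kind in missing and word in lookup:
            found[kind] = lookup[word]
            missing.discard(kind)
            break  # one word fills at most one kind

print(found)
print(missing)  # set()
```

Here lookups is never modified, so it stays reusable across calls, while found and missing carry all the per-sentence state.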

I suggest that you use a new pattern to write this code instead of the old one. The new pattern has 9 lines and stays at 9 lines: just add more dictionaries to D. The old one already has 11 lines and grows by 4 lines with every additional dictionary to test.

aDict = { "a1" : "aa1", "a2" : "aa1" }
bDict = { "b1" : "bb1", "b2" : "bb2" }
text = ["a1", "b2", "a2", "b1"]
# old pattern
a = False
b = False
for word in text:
    if not a:
        if word in aDict:
            a = aDict[word]
            continue
    if not b:
        if word in bDict:
            b = bDict[word]
            continue
print(a, b)
# new pattern
D = [ aDict, bDict]
A = [ False for _ in D]
for word in text:
    for i, a in enumerate(A):
        if not a:
            if word in D[i]:
                A[i] = D[i][word]
                break  # not continue: the word must not be reused for a later dictionary
print(A)
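If the pattern is needed in more than one place, it could be wrapped in a small function. This is a sketch under the assumption that a falsy result means "not found yet", as in the question; the name first_matches is invented here.

```python
def first_matches(words, dictionaries):
    """For each dictionary, in order, record the value of the first
    still-unused word that is one of its keys (False if there is none)."""
    results = [False] * len(dictionaries)
    for word in words:
        for i, lookup in enumerate(dictionaries):
            if not results[i] and word in lookup:
                results[i] = lookup[word]
                break  # the word is used up; move on to the next word
    return results


aDict = {"a1": "aa1", "a2": "aa1"}
bDict = {"b1": "bb1", "b2": "bb2"}

print(first_matches(["a1", "b2", "a2", "b1"], [aDict, bDict]))  # ['aa1', 'bb2']
# The same dictionary may appear twice, e.g. for two adjective slots.
print(first_matches(["b2", "b1"], [bDict, bDict]))              # ['bb2', 'bb1']
```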
