Filter positive and negative words from a string in Python

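I have a string containing a Boolean logic expression, and I want to extract the positive and negative words from it. A negative word is one that is preceded by not.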

Example 1 -

Input - 

(A and B and C) and not (E or F)

Output - 

positive - A & B & C
negative - E | F

Example 2 -

Input -

(A and B) and not E and C

Output - 

positive - A & B & C
negative - E

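I think I could do this by flattening the string, for example -

(A and B and C) and not (E or F) becomes A and B and C and not E or not F, and then using a regular expression to extract the positive and negative words, but I don't know how to do that.

What is the best way to do this?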
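One caveat on the flattening idea: by De Morgan's law, not (E or F) is equivalent to not E and not F, not to not E or not F. The sketch below is only a truth-table check of that claim; the function name truth_table is made up for illustration.

```python
from itertools import product

def truth_table():
    """List (e, f, original, de_morgan, naive) for every combination of e and f."""
    rows = []
    for e, f in product([False, True], repeat=2):
        original = not (e or f)            # the expression as written
        de_morgan = (not e) and (not f)    # flattening via De Morgan's law
        naive = (not e) or (not f)         # flattening that keeps the 'or'
        rows.append((e, f, original, de_morgan, naive))
    return rows

for row in truth_table():
    print(row)
```

The De Morgan column agrees with the original on every row, while the naive column differs whenever exactly one of E and F is true.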

I apologize for not being able to provide a general solution. In working this out, I made the following assumptions:

  1. Parentheses are not nested, e.g., A and (B and (C or (D and E))) is not allowed.
  2. The operator joining the outermost terms is 'and', e.g., (something) and (something) and...
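Both assumptions can be checked before any parsing is attempted. The following is a minimal sketch, not part of the original solution; the helper name satisfies_assumptions is hypothetical.

```python
import re

def satisfies_assumptions(expression):
    """Return True if parentheses are balanced and flat, and no 'or' appears outside them."""
    depth = 0
    top_level = []                 # characters that sit outside any parentheses
    for char in expression:
        if char == "(":
            depth += 1
            if depth > 1:          # assumption 1: a group opened inside another group
                return False
        elif char == ")":
            depth -= 1
            if depth < 0:          # closing parenthesis without an opener
                return False
        elif depth == 0:
            top_level.append(char)
    if depth != 0:                 # an opening parenthesis was never closed
        return False
    # Assumption 2: outside the parentheses, terms are joined only by 'and'.
    return re.search(r"\bor\b", "".join(top_level)) is None

print(satisfies_assumptions("(A and B and C) and not (E or F)"))   # True
print(satisfies_assumptions("A and (B and (C or (D and E)))"))     # False
```

Inputs that fail the check are outside the scope of the regex approach and would need a real expression parser.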

Code:

import regex  # third-party package (pip install regex); supports the variable-length lookbehinds used below

# A term is either a flat parenthesized group such as "(A and B)" or a bare
# uppercase word that is not inside an open parenthesis.
TERM = r"(\((?:[A-Z]+\s+(?:and|or)\s+)+[A-Z]+\)|(?<!\([^\)]*)[A-Z]+)"

def to_symbols(term):
    # Rewrite the word operators as & and |.
    return term.replace(" and ", " & ").replace(" or ", " | ")

def get_positive_and_negative_statements(logical_string):
    # Positive terms are not preceded by "not"; negative terms are.
    positives = regex.findall(r"(?<!not\s+)" + TERM, logical_string)
    negatives = regex.findall(r"(?<=not\s+)" + TERM, logical_string)
    return {
        "positive": " & ".join(to_symbols(term) for term in positives),
        "negative": " & ".join(to_symbols(term) for term in negatives),
    }

str1 = "(A and B and C) and not (E or F)"
str2 = "(A and B) and not E and C"
output1 = get_positive_and_negative_statements(str1)
output2 = get_positive_and_negative_statements(str2)

print(f"In {str1} ---\n{output1}")
print(f"\nIn {str2} ---\n{output2}")

Output:

In (A and B and C) and not (E or F) ---
{'positive': '(A & B & C)', 'negative': '(E | F)'}

In (A and B) and not E and C ---
{'positive': '(A & B) & C', 'negative': 'E'}
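The output above keeps the parentheses, whereas the question asked for A & B & C and E | F. A possible variant is sketched below under the same two assumptions (flat parentheses, uppercase words). It uses only the standard re module and drops parentheses where they are redundant. The names TOKEN, split_terms and render are hypothetical and not part of the original answer.

```python
import re

# An optional "not", then either a flat parenthesized group or a bare uppercase word.
TOKEN = re.compile(r"(not\s+)?(\([^()]*\)|[A-Z]+)")

def split_terms(expression):
    """Return (positives, negatives) as lists of terms, still in and/or form."""
    positives, negatives = [], []
    for match in TOKEN.finditer(expression):
        negated, term = match.groups()
        (negatives if negated else positives).append(term)
    return positives, negatives

def render(terms):
    """Join terms with '&', keeping parentheses only where they change the meaning."""
    rendered = []
    for term in terms:
        inner = term[1:-1] if term.startswith("(") else term
        inner = inner.replace(" and ", " & ").replace(" or ", " | ")
        # Parentheses matter only when an 'or' group is joined with other terms.
        if "|" in inner and len(terms) > 1:
            inner = f"({inner})"
        rendered.append(inner)
    return " & ".join(rendered)

for text in ("(A and B and C) and not (E or F)", "(A and B) and not E and C"):
    positives, negatives = split_terms(text)
    print(f"positive - {render(positives)}")
    print(f"negative - {render(negatives)}")
```

Because finditer consumes each parenthesized group as a single match, the words inside a group are never matched a second time on their own, so no lookbehind is needed.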
