
Filter positive and negative words from a string in Python

I have a string that represents a Boolean expression, and I want to extract the positive and negative words from it. Negative words are the ones preceded by not.

Example 1 -

Input - 

(A and B and C) and not (E or F)

Output - 

positive - A & B & C
negative - E | F

Example 2 -

Input -

(A and B) and not E and C

Output - 

positive - A & B & C
negative - E

I think I can do this by flattening the string first, for example:

(A and B and C) and not (E or F) becomes A and B and C and not E or not F. I could then use a regex to extract the positive and negative words, but I am not sure how to do it.
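One caveat about the flattening step: by De Morgan's law, not (E or F) is equivalent to not E and not F, not to not E or not F. The following is a small truth-table check of that claim; the helper name equivalent is made up for illustration.

```python
from itertools import product


def equivalent(f, g, n_vars=2):
    """Return True if f and g agree on every combination of truth values."""
    return all(f(*values) == g(*values)
               for values in product([False, True], repeat=n_vars))


def original(e, f):
    return not (e or f)


def de_morgan(e, f):
    # Correct expansion: the "or" turns into "and" when "not" is distributed.
    return (not e) and (not f)


def naive(e, f):
    # The flattening written in the question.
    return (not e) or (not f)


print(equivalent(original, de_morgan))  # True
print(equivalent(original, naive))      # False
```

So a flattened form would only preserve the meaning of the expression if the or inside a negated group became an and.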

What would be the best way to do this?

I apologize for not being able to provide a general solution. While framing the solution I made the following assumptions:

  1. Parentheses are not nested, e.g., A and (B and (C or (D and E))) is not allowed.
  2. The top-level (outermost) joining operator is 'and', e.g., (something) and (something) and ...
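The first assumption can be checked before the regex is applied. The following is a minimal sketch, not part of the original answer; max_paren_depth and has_nested_parentheses are hypothetical helper names.

```python
def max_paren_depth(expression: str) -> int:
    """Return the deepest level of parenthesis nesting in the expression."""
    depth = 0
    deepest = 0
    for char in expression:
        if char == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' in expression")
    if depth != 0:
        raise ValueError("unbalanced '(' in expression")
    return deepest


def has_nested_parentheses(expression: str) -> bool:
    """True if any parenthesized group contains another group."""
    return max_paren_depth(expression) > 1


print(has_nested_parentheses("(A and B and C) and not (E or F)"))  # False
print(has_nested_parentheses("A and (B and (C or (D and E)))"))    # True
```

An input for which has_nested_parentheses returns True falls outside what the code below is meant to handle.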

Code:

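# Requires the third-party "regex" package (pip install regex). The patterns
# use variable-length lookbehinds, which the standard "re" module rejects.
import regex

def get_positive_and_negative_statements(logical_string):
    output = dict()
    # Positive terms: a parenthesized group, or a bare word outside parentheses,
    # that is NOT preceded by "not".
    positives = regex.findall(r"(?<!not\s+)(\((?:[A-Z]+\s+(?:and|or)\s+)+[A-Z]+\)|(?<!\([^\)]*)[A-Z]+)", logical_string)
    output["positive"] = " & ".join([element.replace(" and ", " & ").replace(" or ", " | ") for element in positives])
    # Negative terms: the same pattern, but preceded by "not".
    negatives = regex.findall(r"(?<=not\s+)(\((?:[A-Z]+\s+(?:and|or)\s+)+[A-Z]+\)|(?<!\([^\)]*)[A-Z]+)", logical_string)
    output["negative"] = " & ".join([element.replace(" and ", " & ").replace(" or ", " | ") for element in negatives])
    return output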

str1 = "(A and B and C) and not (E or F)"
str2 = "(A and B) and not E and C"
output1 = get_positive_and_negative_statements(str1)
output2 = get_positive_and_negative_statements(str2)

print(f"In {str1} ---\n{output1}")
print(f"\nIn {str2} ---\n{output2}")

Output:

In (A and B and C) and not (E or F) ---
{'positive': '(A & B & C)', 'negative': '(E | F)'}

In (A and B) and not E and C ---
{'positive': '(A & B) & C', 'negative': 'E'}
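For comparison, here is a minimal sketch that uses only the standard library and produces the format asked for in the question, without parentheses in the output. It is a different technique from the lookbehind approach: it tokenizes the string and splits on every 'and' that sits outside parentheses. The names split_top_level, render and split_positive_negative are made up for illustration, the sketch relies on the same two assumptions, and joining several negative terms with & is a guess at the intended format.

```python
import re

SYMBOLS = {"and": "&", "or": "|"}


def split_top_level(expression, keyword="and"):
    """Split the token stream on a keyword that is outside any parentheses."""
    parts, current, depth = [], [], 0
    for token in re.findall(r"\(|\)|[A-Za-z]+", expression):
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
        if token == keyword and depth == 0:
            parts.append(current)
            current = []
        else:
            current.append(token)
    parts.append(current)
    return parts


def render(tokens):
    """Drop one pair of enclosing parentheses and swap keywords for symbols."""
    if tokens and tokens[0] == "(" and tokens[-1] == ")":
        tokens = tokens[1:-1]
    return " ".join(SYMBOLS.get(token, token) for token in tokens)


def split_positive_negative(expression):
    positive, negative = [], []
    for term in split_top_level(expression):
        if term and term[0] == "not":
            negative.append(render(term[1:]))  # skip the leading "not"
        else:
            positive.append(render(term))
    return {"positive": " & ".join(positive), "negative": " & ".join(negative)}


print(split_positive_negative("(A and B and C) and not (E or F)"))
# {'positive': 'A & B & C', 'negative': 'E | F'}
print(split_positive_negative("(A and B) and not E and C"))
# {'positive': 'A & B & C', 'negative': 'E'}
```

Dropping the parentheses matches the requested output, but it becomes ambiguous when a positive group contains or: (A or B) and C would be rendered as A | B & C.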
