I have a string containing a boolean logic expression, and I want to extract the positive and negative words from it. Negative words are the ones preceded by not.
Example 1 -
Input -
(A and B and C) and not (E or F)
Output -
positive - A & B & C
negative - E | F
Example 2 -
Input -
(A and B) and not E and C
Output -
positive - A & B & C
negative - E
I think I can do this by flattening the string out, as in -
(A and B and C) and not (E or F)
becomes A and B and C and not E or not F
and then using a regex to extract the positive and negative words, but I am not sure how to do it.
What would be the best way to do this?
I apologize for not being able to provide a general solution. While framing the solution I made the following assumptions:
Code:
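# Third-party "regex" module (pip install regex), not the standard "re":
# the lookbehinds below are variable-width, which "re" does not accept.
import regex

def get_positive_and_negative_statements(logical_string):
    output = dict()
    # Positive terms: a parenthesized group, or a bare word outside parentheses, NOT preceded by "not".
    positives = regex.findall(r"(?<!not\s+)(\((?:[A-Z]+\s+(?:and|or)\s+)+[A-Z]+\)|(?<!\([^\)]*)[A-Z]+)", logical_string)
    output["positive"] = " & ".join([element.replace(" and ", " & ").replace(" or ", " | ") for element in positives])
    # Negative terms: the same two shapes, but preceded by "not".
    negatives = regex.findall(r"(?<=not\s+)(\((?:[A-Z]+\s+(?:and|or)\s+)+[A-Z]+\)|(?<!\([^\)]*)[A-Z]+)", logical_string)
    output["negative"] = " & ".join([element.replace(" and ", " & ").replace(" or ", " | ") for element in negatives])
    return output
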
str1 = "(A and B and C) and not (E or F)"
str2 = "(A and B) and not E and C"
output1 = get_positive_and_negative_statements(str1)
output2 = get_positive_and_negative_statements(str2)
print(f"In {str1} ---\n{output1}")
print(f"\nIn {str2} ---\n{output2}")
Output:
In (A and B and C) and not (E or F) ---
{'positive': '(A & B & C)', 'negative': '(E | F)'}
In (A and B) and not E and C ---
{'positive': '(A & B) & C', 'negative': 'E'}
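Note that the output above keeps the parentheses, whereas the question asks for A & B & C. As a possible alternative, here is a minimal sketch that uses only the standard library: it tokenizes the string, splits it on every "and" that sits outside parentheses, and sorts each resulting term by whether it starts with "not". The names split_top_level_and, join_terms and extract_terms are made up for this illustration. It assumes the whole expression is a conjunction of terms, where each term is a single word or one non-nested parenthesized group, optionally preceded by "not". It is not a general boolean parser.

```python
import re

SYMBOLS = {"and": "&", "or": "|"}

def split_top_level_and(expression):
    """Split the token stream on each 'and' that is outside parentheses."""
    terms, current, depth = [], [], 0
    for token in re.findall(r"\(|\)|\w+", expression):
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
        if token == "and" and depth == 0:
            terms.append(current)
            current = []
        else:
            current.append(token)
    terms.append(current)
    return terms

def join_terms(terms):
    """Join terms with '&', keeping parentheses only around '|' groups."""
    if len(terms) == 1:
        return terms[0]
    return " & ".join(f"({t})" if "|" in t else t for t in terms)

def extract_terms(expression):
    positive, negative = [], []
    for term in split_top_level_and(expression):
        negated = term[:1] == ["not"]
        if negated:
            term = term[1:]
        # Assumption: a term is either a bare word or ONE parenthesized group.
        if term[:1] == ["("] and term[-1:] == [")"]:
            term = term[1:-1]
        text = " ".join(SYMBOLS.get(token, token) for token in term)
        (negative if negated else positive).append(text)
    return {"positive": join_terms(positive), "negative": join_terms(negative)}

print(extract_terms("(A and B and C) and not (E or F)"))
print(extract_terms("(A and B) and not E and C"))
```

Under those assumptions, the two examples from the question give 'A & B & C' / 'E | F' and 'A & B & C' / 'E'. A positive group that contains "or" keeps its parentheses when combined with other terms, so "(A or B) and C" yields '(A | B) & C'. Nested parentheses or a top-level "or" would need a real parser.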