
Complex (repeating) rule using the spaCy pattern Matcher

I want to match a repeating pattern using spaCy's pattern matcher. The following is the kind of text I want to match:

My account number is: 2893-26492-634-0924-63. Some more text here. My account number is: 2893-26492-634-0924-63. Some more text here.

Basically, I am trying to match the following regex: \d+(-\d+)*

from spacy.matcher import Matcher  # assumes nlp is an already loaded spaCy pipeline
matcher = Matcher(nlp.vocab)
matcher.add('NUMBER_MERGE', None, [{'IS_DIGIT': True}, {'IS_PUNCT': True}, {'IS_DIGIT': True}, {'IS_SPACE': True}])

This matches "342-234 Text", but fails for "342-234-958 Text".

I did not find any documentation on applying a quantifier to a group of token patterns. Any help would be appreciated.

You can use a regular expression directly as the pattern.

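# Raw string so that \d is not treated as a string escape; REGEX is checked against each token's text.
matcher.add('NUMBER_MERGE', None, [{"TEXT": {"REGEX": r"\d+(-\d+)*"}}])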
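One caveat: a REGEX inside a token pattern is tested against one token at a time, and the question's three-token pattern suggests the tokenizer splits these numbers at the hyphens. If that is the case, a possible workaround is to run the regex over the raw text instead. The following is only a minimal sketch using Python's standard re module; the names ACCOUNT_RE and find_number_groups are made up for illustration and are not part of spaCy.

```python
import re

# Same regex as in the question, applied to the whole string instead of per token.
# (?:...) is a non-capturing group; the pattern is otherwise unchanged.
ACCOUNT_RE = re.compile(r"\d+(?:-\d+)*")


def find_number_groups(text):
    """Return (start, end, matched_text) for every run of hyphen-separated digits."""
    return [(m.start(), m.end(), m.group()) for m in ACCOUNT_RE.finditer(text)]


text = "My account number is: 2893-26492-634-0924-63. Some more text here."
matches = find_number_groups(text)
for start, end, value in matches:
    print(start, end, value)
```

The character offsets can then be turned into a spaCy span with doc.char_span(start, end), which returns None when the offsets do not line up with token boundaries.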
