spaCy DependencyMatcher pattern returns no matches

Question

I am trying to create and add a pattern with spaCy's DependencyMatcher and get matches from it.

I created a pattern for the sentence: "From Monday to Friday"

The full pattern:

pattern = [
    {
        "RIGHT_ID": "node0",
        "RIGHT_ATTRS": {'DEP': 'ROOT', 'POS': 'ADP', 'TAG': 'IN'}
    },
    {
        "LEFT_ID": "node0",
        "REL_OP": ">",
        "RIGHT_ID": "node1",
        "RIGHT_ATTRS": {'DEP': 'pobj', 'POS': 'PROPN', 'TAG': 'NNP'},
    },
    {
        "LEFT_ID": "node1",
        "REL_OP": "$--",
        "RIGHT_ID": "node2",
        "RIGHT_ATTRS": {'DEP': 'prep', 'POS': 'ADP', 'TAG': 'IN'},
    },
    {
        "LEFT_ID": "node2",
        "REL_OP": ">",
        "RIGHT_ID": "node3",
        "RIGHT_ATTRS": {'DEP': 'pobj', 'POS': 'PROPN', 'TAG': 'NNP'},
    },
]

The simpler pattern:

pattern = [
    {
        "RIGHT_ID": "node0",
        "RIGHT_ATTRS": {"POS": "ADP"}
    },
    {
        "LEFT_ID": "node0",
        "REL_OP": ">",
        "RIGHT_ID": "node1",
        "RIGHT_ATTRS": {"POS": "PROPN"},
    },
    {
        "LEFT_ID": "node1",
        "REL_OP": "$--",
        "RIGHT_ID": "node2",
        "RIGHT_ATTRS": {"POS": "ADP"},
    },
    {
        "LEFT_ID": "node2",
        "REL_OP": ">",
        "RIGHT_ID": "node3",
        "RIGHT_ATTRS": {'POS': 'PROPN'},
    },
]

My question is: why does neither the full pattern nor the simpler one produce any matches?

import spacy
from spacy.matcher import DependencyMatcher


nlp = spacy.load("en_core_web_sm")
matcher = DependencyMatcher(nlp.vocab)


text="From monday to friday"
doc = nlp(text)
matcher.add("pattern1", [pattern])

matches = matcher(doc)

# Each token_id corresponds to one pattern dict
match_id, token_ids = matches[0]

spaCy versions:

spaCy v3.0.6

NAME             SPACY            VERSION
en_core_web_sm   >=3.0.0,<3.1.0   3.0.0     ✔

Answer 1

Your REL_OP for node2 is backwards. It should be $++, because node2 ("to") is a sibling that comes after node1 ("Monday") in the sentence, not before it.
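To see why the direction matters, here is a minimal pure-Python sketch of what the two sibling operators select. It is a toy model, not spaCy's implementation, and the `heads` list is an assumed parse of the sentence ("From" as root, "Monday" and "to" attached to "From", "Friday" attached to "to"). The helper name `siblings` is made up for illustration.

```python
# Toy model of the sibling operators; this is NOT spaCy's implementation.
# Assumed parse of "From Monday to Friday": each entry is the index of the
# token's head, with None marking the root.
words = ["From", "Monday", "to", "Friday"]
heads = [None, 0, 0, 2]


def siblings(anchor, rel_op):
    """Return words sharing the anchor's head: to its right for "$++", to its left for "$--"."""
    found = []
    for i, head in enumerate(heads):
        # Skip the anchor itself, the root, and tokens with a different head.
        if i == anchor or head is None or head != heads[anchor]:
            continue
        if (rel_op == "$++" and i > anchor) or (rel_op == "$--" and i < anchor):
            found.append(words[i])
    return found


monday = words.index("Monday")
right_of_monday = siblings(monday, "$++")
left_of_monday = siblings(monday, "$--")
print(right_of_monday)  # ['to']
print(left_of_monday)   # []
```

Under this assumed parse, "Monday" has no sibling to its left, so a `$--` step anchored on it can never be satisfied.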

To give a full explanation, this code works for me:

import spacy

from spacy.matcher import DependencyMatcher

nlp = spacy.load("en_core_web_sm")
matcher = DependencyMatcher(nlp.vocab)

text="From Monday to Friday"
doc = nlp(text)

pattern = [
    {
        "RIGHT_ID": "node0",
        "RIGHT_ATTRS": {'POS': 'ADP', 'TAG': 'IN'}
    },
    {
        "LEFT_ID": "node0",
        "REL_OP": ">",
        "RIGHT_ID": "node1",
        "RIGHT_ATTRS": {'POS': 'PROPN'},
    },
    {
        "LEFT_ID": "node1",
        "REL_OP": "$++",
        "RIGHT_ID": "node2",
        "RIGHT_ATTRS": {'POS': 'ADP'},
    },
    {
        "LEFT_ID": "node2",
        "REL_OP": ">",
        "RIGHT_ID": "node3",
        "RIGHT_ATTRS": {'POS': 'PROPN'},
    },
]

matcher.add("pattern1", [pattern])

matches = matcher(doc)
print(matches)

print("-----")
# this part is just for reference
for word in doc:
    print(word.pos_, word.tag_, word.dep_, word, sep="\t")

A couple of points about this:

Your second pattern is better: you shouldn't need to specify both tag and POS for English, since the tag determines the POS.
In the v3 small model, "monday" and "friday" are not tagged as proper nouns unless capitalized. (It looks like your displaCy output is from the public demo, which uses v2.)
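Once the pattern matches, each result is a `(match_id, token_ids)` tuple in which `token_ids[i]` corresponds to the i-th pattern dict. The sketch below maps those indices back to the `RIGHT_ID` names and also copes with an empty result, where `matches[0]` would raise an `IndexError`. The helper `name_matched_tokens` and the sample match are hypothetical; a real `match_id` is a hash value and the words would come from the `doc`.

```python
def name_matched_tokens(matches, pattern, words):
    """Map each match's token indices to the RIGHT_ID names in the pattern."""
    named = []
    for _match_id, token_ids in matches:
        # token_ids[i] is the token matched by pattern[i]
        named.append(
            {pattern[i]["RIGHT_ID"]: words[token_id]
             for i, token_id in enumerate(token_ids)}
        )
    return named


# Only the RIGHT_ID keys matter for this helper.
pattern = [
    {"RIGHT_ID": "node0"},
    {"RIGHT_ID": "node1"},
    {"RIGHT_ID": "node2"},
    {"RIGHT_ID": "node3"},
]
words = ["From", "Monday", "to", "Friday"]

# Hypothetical match, shaped like the matcher's output.
sample_matches = [(123, [0, 1, 2, 3])]

named = name_matched_tokens(sample_matches, pattern, words)
print(named)

# An empty result is handled without an IndexError.
no_matches = name_matched_tokens([], pattern, words)
print(no_matches)
```

Looping over the result rather than indexing `matches[0]` also makes the "no matches" case visible as an empty list instead of an exception.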

Answer 1 (accepted, 2021-05-24 04:59:36)