What is the difference between parsing and Part Of Speech Tagging?
I know that POS tagging labels every word in a sentence with its appropriate part of speech, but isn't that what a parser does too, i.e., break a sentence into its component parts?
I've looked this up on the internet but couldn't find a satisfactory explanation.
Please clear up my doubt.
Thanks in advance
They are two distinct procedures:
POS Tagging: each token gets assigned a label which reflects its word class.
Parsing: each sentence gets assigned a structure (often a tree) which reflects how its components are related to each other.
POS Tagging takes a tokenised sequence of words and returns a list of annotated tokens, where each token has a word class label. When a word could belong to more than one class, the label is often disambiguated by looking at the context surrounding the token.
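To make the input and output concrete, here is a minimal sketch of a tagger in plain Python. The lexicon, the tag names, and the single context rule are all made up for illustration; real taggers are usually statistical and trained on annotated corpora, so treat this only as a picture of the data flow (tokens in, one label per token out).

```python
# Hypothetical toy lexicon: each word maps to the word classes it can take.
LEXICON = {
    "i": ["PRON"],
    "the": ["DET"],
    "a": ["DET"],
    "read": ["VERB"],
    "flight": ["NOUN"],
    "book": ["NOUN", "VERB"],  # ambiguous: "the book" vs. "book a flight"
}


def tag(tokens):
    """Return a list of (token, tag) pairs, one pair per input token."""
    tagged = []
    for token in tokens:
        # Assumption: words missing from the lexicon default to NOUN.
        candidates = LEXICON.get(token.lower(), ["NOUN"])
        prev_tag = tagged[-1][1] if tagged else None
        if len(candidates) == 1:
            choice = candidates[0]
        elif prev_tag == "DET" and "NOUN" in candidates:
            # Context rule: after a determiner, prefer the noun reading.
            choice = "NOUN"
        elif "VERB" in candidates:
            choice = "VERB"
        else:
            choice = candidates[0]
        tagged.append((token, choice))
    return tagged


print(tag(["I", "read", "the", "book"]))
print(tag(["Book", "a", "flight"]))
```

Note that the output is still a flat list: every token has a label, but nothing says which tokens belong together.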
There is also chunking, which groups tokens into related groups (such as noun phrases). Chunks are non-overlapping sequences.
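As a rough illustration of chunking, the sketch below groups already-tagged tokens into noun-phrase chunks. The pattern (an optional determiner, any number of adjectives, then one or more nouns) and the tag names are assumptions chosen for the example, not a standard definition of a noun phrase.

```python
def np_chunks(tagged):
    """Group (token, tag) pairs into non-overlapping noun-phrase chunks.

    Assumed pattern: optional DET, any number of ADJ, one or more NOUN.
    """
    chunks = []
    i = 0
    n = len(tagged)
    while i < n:
        j = i
        if tagged[j][1] == "DET":
            j += 1
        while j < n and tagged[j][1] == "ADJ":
            j += 1
        k = j
        while k < n and tagged[k][1] == "NOUN":
            k += 1
        if k > j:
            # At least one noun was found: tokens i..k-1 form one chunk.
            chunks.append([token for token, _ in tagged[i:k]])
            i = k  # continue after the chunk, so chunks never overlap
        else:
            i += 1  # no chunk starts here; move on
    return chunks


tagged = [
    ("the", "DET"), ("quick", "ADJ"), ("fox", "NOUN"),
    ("saw", "VERB"),
    ("a", "DET"), ("dog", "NOUN"),
]
print(np_chunks(tagged))
```

The result is a flat list of groups. Unlike a parse, it does not say how the groups relate to each other or to the verb.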
Parsing commonly results in a parse tree for a sentence; for an ambiguous sentence there can often be many possible trees.
POS tagging is usually a preparatory step in parsing, as a parser typically operates on word classes (though there are some parsing algorithms that work with tokens directly, or with a mixture of tags and tokens).
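To show a parser working on word classes rather than on the words themselves, here is a small sketch of a CKY-style chart that counts how many trees a toy grammar allows for a sequence of POS tags. The grammar, the symbol names, and the tag set are invented for the example; a real parser would use a much larger grammar or a trained model, and would return the trees rather than just count them.

```python
from collections import defaultdict

# Hypothetical mapping from POS tags to the grammar symbols used at the leaves.
TAG_TO_SYMBOL = {"PRON": "NP", "VERB": "V", "DET": "DET", "NOUN": "N", "PREP": "P"}

# Hypothetical toy grammar, binary rules only: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "DET", "N"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]


def count_parses(tags, root="S"):
    """Count the distinct parse trees rooted at `root` for a sequence of POS tags."""
    n = len(tags)
    # chart[(i, j)][symbol] = number of trees covering tags[i:j] with that root symbol
    chart = defaultdict(lambda: defaultdict(int))
    for i, pos_tag in enumerate(tags):
        chart[(i, i + 1)][TAG_TO_SYMBOL[pos_tag]] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between left and right child
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += chart[(i, k)][left] * chart[(k, j)][right]
    return chart[(0, n)][root]


# Tags for "I saw the man with the telescope"
ambiguous = ["PRON", "VERB", "DET", "NOUN", "PREP", "DET", "NOUN"]
# Tags for "I saw the man"
simple = ["PRON", "VERB", "DET", "NOUN"]
print(count_parses(ambiguous))  # 2
print(count_parses(simple))     # 1
```

Under this grammar the first tag sequence gets two trees: one where the prepositional phrase attaches to the verb phrase (the seeing was done with the telescope) and one where it attaches to the noun phrase (the man has the telescope). The tags are identical in both readings, which is the point: tagging labels the words, while parsing decides the structure over them.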