
Horizontal and Vertical Markovization

I have a sentence along with its grammar in tree form. I need to train a Probabilistic Context-Free Grammar from it so that I can give the best possible parse for it. I am using the Viterbi CKY algorithm to get the best parse. The sentences are in the following tree format: (TOP (S (NP (DT The) (NN flight)) (VP (MD should) (VP (VB be) (NP (NP (CD eleven) (RB am)) (NP (NN tomorrow)))))) (PUNC .))

I have built a system that learns a probabilistic grammar from the ATIS section of the Penn Treebank and can now produce a possible parse for the above sentence.

I read about horizontal and vertical Markovization techniques, which can help increase accuracy by using annotations. I am a little confused as to how they work. Can someone point me to some explanatory examples, or illustrate how they work and how they affect the accuracy?

It is worth looking at this paper by Klein and Manning:

http://nlp.stanford.edu/~manning/papers/unlexicalized-parsing.pdf

Vertical Markovization is a technique that provides context for a given rule. From the above paper:

For example, subject NP expansions are very different from object NP expansions: a subject NP is 8.7 times more likely than an object NP to expand as just a pronoun. Having separate symbols for subject and object NPs allows this variation to be captured and used to improve parse scoring. One way of capturing this kind of external context is to use parent annotation, as presented in Johnson (1998). For example, NPs with S parents (like subjects) will be marked NPˆS, while NPs with VP parents (like objects) will be NPˆVP.

Rewriting the rules with this additional parent annotation adds information about where in the tree each rule is being applied, and this extra context yields more accurate probabilities for each rule rewrite.

The implementation of this is quite simple. Using the training data, start at the bottom non-terminals (the preterminal tags such as DT, NNP, NN, VB, etc., which rewrite directly to words) and append a ^ followed by the parent non-terminal. In your example, the first rewrite would be NP^S, and so on. Continue up the tree until you reach TOP (which you do not rewrite); in your case, the final rewrite would be S^TOP. Stripping the annotations from your output will give you the final parse tree.
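The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: I'm representing a tree as a nested tuple of (label, children...) with plain strings as terminal words, and the `annotate` helper is a hypothetical name for the parent-annotation step.

```python
def annotate(tree, parent=None):
    """Parent-annotate a tree given as (label, child1, child2, ...).

    Terminal words are plain strings and are left unchanged.
    TOP (the root, which has no parent) keeps its name; every other
    non-terminal, including preterminals like DT and NN, becomes
    label^parent, e.g. NP^S, DT^NP.
    """
    if isinstance(tree, str):              # a terminal word: leave as-is
        return tree
    label, *children = tree
    new_label = label if parent is None else f"{label}^{parent}"
    # recurse, passing the *original* label down as the parent
    return (new_label, *(annotate(child, label) for child in children))

# The example sentence from the question, as a nested tuple:
example = ("TOP",
           ("S",
            ("NP", ("DT", "The"), ("NN", "flight")),
            ("VP", ("MD", "should"),
             ("VP", ("VB", "be"),
              ("NP",
               ("NP", ("CD", "eleven"), ("RB", "am")),
               ("NP", ("NN", "tomorrow")))))),
           ("PUNC", "."))

annotated = annotate(example)
```

After annotation, the root stays TOP, its S child becomes S^TOP, the subject NP becomes NP^S, and its determiner becomes DT^NP, exactly as described above. You would read off the PCFG rule counts from these annotated trees, and strip the ^-suffixes from the parser's output to recover ordinary trees.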

As for horizontal Markovization, see this thread for a good discussion: Horizontal Markovization.
