
What are the ways of Key-Value extraction from unstructured text?

Original 2019-05-31 07:55:10 3 1 regex/ machine-learning/ nlp/ information-extraction/ ner

I'm trying to figure out what approaches exist for extracting values for predefined keys from unstructured text, and which of them is best.

Input:

The doctor prescribed me a drug called favipiravir.
His name is Yury.
Ilya has already told me about that.
The weather is cold today.
I am taking a medicine called nazivin.

Key list: ['drug', 'name', 'weather']

Output:

['drug=favipiravir', 'drug=nazivin', 'name=Yury', 'weather=cold']

As you can see, the third sentence contains no explicit 'name' key, so no value is extracted from it (I think this is the difference from NER). At the same time, 'drug' and 'medicine' are synonyms, so 'medicine' should be treated as the 'drug' key and its value extracted as well.

The next question: what if the key set is mutable? Should I use a regexp approach as the base because the keys are predefined, or is there a way to implement this with supervised learning or a neural network? (And in that case, how do I deal with mutable keys?)
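For illustration, here is a minimal sketch of the regexp approach applied to the example above. The synonym table, the function name `extract_pairs`, and the assumption that a value is the single word following "is" or "called" are all hypothetical choices made for this sketch, not something established in the question. The key list is passed in at call time and the pattern is rebuilt from it, which is one simple way to cope with a mutable key set.

```python
import re

# Hypothetical synonym table: surface word -> canonical key.
SYNONYMS = {
    "drug": "drug",
    "medicine": "drug",
    "name": "name",
    "weather": "weather",
}

def extract_pairs(text, keys, synonyms=SYNONYMS):
    """Return 'key=value' strings for every '<key word> is|called <value>' match."""
    # Only keep surface words whose canonical key is currently requested.
    words = [word for word, key in synonyms.items() if key in keys]
    if not words:
        return []
    # Assumption: the value is the single word right after "is" or "called".
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, words)) + r")\s+(?:is|called)\s+(\w+)",
        re.IGNORECASE,
    )
    pairs = [(synonyms[m.group(1).lower()], m.group(2)) for m in pattern.finditer(text)]
    # Group results by the order of the key list (the sort is stable).
    pairs.sort(key=lambda pair: keys.index(pair[0]))
    return [f"{key}={value}" for key, value in pairs]

text = """The doctor prescribed me a drug called favipiravir.
His name is Yury.
Ilya has already told me about that.
The weather is cold today.
I am taking a medicine called nazivin."""

print(extract_pairs(text, ["drug", "name", "weather"]))
# ['drug=favipiravir', 'drug=nazivin', 'name=Yury', 'weather=cold']
```

A pattern like this only covers the phrasings it was written for, so it is probably best seen as a baseline to compare a learned tagger against.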

1 answer

You can use a tagger to label words. Your problem is similar to Named Entity Recognition (NER). Many libraries ship with pretrained NER taggers that you can try. They are generally trained to identify names, locations, etc. Depending on the types of words you need, you may have to train the tagger yourself, in which case you will also need some labeled data.
Check out this link: https://nlp.stanford.edu/software/CRF-NER.html
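To make "labeled data" more concrete, here is a small sketch of how annotated values might be turned into per-token tags. It uses the BIO scheme (B- for the first token of a value, I- for continuation tokens, O for everything else), which is a common convention for sequence taggers. The helper name `bio_tags`, the `DRUG` label, and the token-index span format are assumptions made for this sketch; the exact file format a particular toolkit expects should be checked in its documentation.

```python
def bio_tags(tokens, spans):
    """Build BIO tags for tokens.

    spans: list of (start, end, label) with token indices, end exclusive.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"      # first token of the value
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"      # remaining tokens of the value
    return tags

# Naive whitespace tokenization, enough for the illustration.
tokens = "I am taking a medicine called nazivin .".split()
tags = bio_tags(tokens, [(6, 7, "DRUG")])  # token 6 is "nazivin"

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

With one label per key, adding a new key means annotating examples for it and retraining, which is the main cost of the supervised route when the key set changes often.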

Extracting key-value pairs from multiline text in java

Extracting string value from unstructured text

Substitution substrings from a key-value table

Column separated key-value text with possible multiline strings and key-value substrings

How to get a value from a text document that has an unstructured table

Elegant parsing of text-based key-value list

How to match text between as key-value pairs

Create a regex with target text having varying key-value pairs?

Extract value from a list of key-value pairs using grep

Extract a substring from value of key-value pair using regex


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this post, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.



solution1 1 2019-05-31 15:01:25
