
Dataset creation using feature extraction from text

I am trying to extract a few features from text data on terrorist events to create a dataset. Using named entity recognition, I have successfully extracted features such as name, place, and organization. Now I want to extract the number of members involved in the incident.

The 2008 Mumbai attacks (also referred to as 26/11) were a series of terrorist attacks that took place in
November 2008, when 10 members of Lashkar-e-Taiba, a terrorist organization based in Pakistan,
carried out 12 coordinated shooting and bombing attacks lasting four days across Mumbai.

From the above text, how can I extract "10 members of Lashkar-e-Taiba" and place 10 in the number-of-attackers column? Is that even possible using NLP techniques?

Two techniques that could be useful in your case are dependency parsing and semantic role labeling. You may also want to look up aspect-based sentiment analysis. All three can help identify relationships between words in a sentence. For example, a dependency parse would typically attach "10" to "members" as a numeric modifier, which gives you both the count and the noun it quantifies.
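As a rough starting point, the number-to-noun relationship can be approximated without a parser. The sketch below is not dependency parsing or semantic role labeling. It is a plain regular-expression stand-in that looks for a count directly followed by a noun that may denote attackers. The names `ACTOR_NOUNS`, `WORD_NUMBERS`, and `extract_attacker_count` are hypothetical, and the keyword list is an assumption you would need to adapt to your data.

```python
import re

# Assumption: a hand-picked list of nouns that may denote the people
# carrying out an attack. Extend it for your own corpus.
ACTOR_NOUNS = ("members", "attackers", "gunmen", "militants", "terrorists")

# Small word-to-number map so spelled-out counts ("four") are handled too.
WORD_NUMBERS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11, "twelve": 12,
}

COUNT = r"\d+|" + "|".join(WORD_NUMBERS)
NOUNS = "|".join(ACTOR_NOUNS)

# The count and noun are matched case-insensitively; the optional
# "of <Group>" part stays case-sensitive so it only captures a
# capitalized name such as "Lashkar-e-Taiba".
PATTERN = re.compile(
    rf"\b(?P<count>(?i:{COUNT}))\s+(?P<noun>(?i:{NOUNS}))\b"
    rf"(?:\s+of\s+(?P<group>[A-Z][\w-]*))?"
)


def extract_attacker_count(text):
    """Return the first '<count> <actor noun>' match as a dict, or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    raw = match.group("count").lower()
    count = int(raw) if raw.isdigit() else WORD_NUMBERS[raw]
    return {
        "count": count,
        "phrase": match.group(0),
        "group": match.group("group"),  # None when no "of <Group>" follows
    }


sample = (
    "The 2008 Mumbai attacks (also referred to as 26/11) were a series of "
    "terrorist attacks that took place in November 2008, when 10 members of "
    "Lashkar-e-Taiba, a terrorist organization based in Pakistan, carried out "
    "12 coordinated shooting and bombing attacks lasting four days across Mumbai."
)

result = extract_attacker_count(sample)
print(result)
```

The `count` field is what would go into the number-of-attackers column. A pattern like this misses paraphrases ("a group of ten men") and can match counts that are not attackers, which is where a dependency parser or semantic role labeler is likely to do better.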
