
Are there any alternate ways other than Named Entity Recognition to extract event names from sentences?

I'm a newbie to NLP and I'm working on NER using OpenNLP. I have a sentence like "We have a dinner party today". Here "dinner party" is an event type. Similarly, in the sentence "we have a room reservation", "room reservation" is an event type. My goal is to extract such phrases from sentences and label them as "Event_types" in the final output.

This can be achieved reasonably well by creating a custom NER model, annotating the sentences in the training dataset with the proper tags. But the event types can be heterogeneous and random, so it is very hard to label all possible patterns (i.e. event types can be anything like "security meeting", "family function", "parents teachers meeting", etc.). So I'm looking for an alternate way to solve this problem.

Immediate response would be appreciated. Thanks :)

Basically you have two options:

1) A list-based approach, where you keep lists of the entities you want to extract from text. To cope with heterogeneous language use, you can train an embedding (e.g. Word2Vec or FastText) to identify contextually similar phrases and add them to your list.

2) Train a custom CRF on data you have annotated (this obviously requires that you annotate a bunch of sentences with the corresponding tags).

I guess the ideal solution really depends on the data and on people's willingness to annotate it.
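As a rough illustration of the list-based option, here is a minimal Python sketch of a phrase-list (gazetteer) matcher that prefers the longest match. It is only a sketch under stated assumptions: the EVENT_TYPES list, the helper names (tokenize, build_index, extract_events) and the simple regex tokenizer are all made up for this example and are not part of OpenNLP. The embedding step is not shown; phrases found that way would simply be added to the list.

```python
import re

# Hypothetical gazetteer of event types (assumption); extend it with phrases
# from your own data, e.g. neighbours suggested by a trained embedding.
EVENT_TYPES = {
    "dinner party",
    "room reservation",
    "security meeting",
    "family function",
    "parents teachers meeting",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(phrases):
    """Map each phrase's first token to its candidate phrases, longest first."""
    index = {}
    for phrase in phrases:
        tokens = tuple(tokenize(phrase))
        if tokens:
            index.setdefault(tokens[0], []).append(tokens)
    for candidates in index.values():
        candidates.sort(key=len, reverse=True)
    return index


def extract_events(sentence, index, label="Event_types"):
    """Return (phrase, label) pairs for every listed phrase found in the sentence."""
    tokens = tokenize(sentence)
    found = []
    i = 0
    while i < len(tokens):
        match = None
        for candidate in index.get(tokens[i], []):
            if tuple(tokens[i:i + len(candidate)]) == candidate:
                match = candidate
                break
        if match:
            found.append((" ".join(match), label))
            i += len(match)  # skip past the matched phrase
        else:
            i += 1
    return found


INDEX = build_index(EVENT_TYPES)
print(extract_events("We have a dinner party today", INDEX))
# prints [('dinner party', 'Event_types')]
```

Exact matching like this only finds phrases that are already in the list, which is why the list needs to be expanded (by hand or with embedding neighbours) to cover unseen event types.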


 