
What kind of neural network should I use to extract key information from a sentence for RDF rules?

I am working on my paper, and one of the tasks is to extract the company name and location from sentences of the following type:

"Google shares resources with Japan based company." “ Google与日本公司共享资源。”

Here, I want the output to be "Google Japan". The sentence structure may also vary, as in "Japan based company can access the resources of Google". I have tried an attention-based NN, but the error rate is around 0.4. Can anyone give me a hint about which model I should use?

I printed out the validation process like this: [screenshot: validation print]

And I got the graphs of the loss and accuracy: [screenshots: loss and accuracy]

They show that the val_acc is 0.99. Does this mean my model is pretty good at predicting? But why do I get a 0.4 error rate when I use my own validation function to show the error rate? I am very new to ML. What does val_acc actually mean?

Here is my model:

from tensorflow.keras.layers import (Input, Embedding, LSTM, Dense, Activation,
                                     TimeDistributed, dot, concatenate)
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping

encoder_input = Input(shape=(INPUT_LENGTH,))
decoder_input = Input(shape=(OUTPUT_LENGTH,))

# Encoder: embed the source tokens and keep every hidden state
# (return_sequences=True) so the attention step can attend over them.
encoder = Embedding(input_dict_size, 64, input_length=INPUT_LENGTH, mask_zero=True)(encoder_input)
encoder = LSTM(64, return_sequences=True, unroll=True)(encoder)
encoder_last = encoder[:, -1, :]  # final encoder state, used to seed the decoder

# Decoder: embed the target tokens; initialise both the hidden and the
# cell state of the decoder LSTM with the encoder's final state.
decoder = Embedding(output_dict_size, 64, input_length=OUTPUT_LENGTH, mask_zero=True)(decoder_input)
decoder = LSTM(64, return_sequences=True, unroll=True)(decoder, initial_state=[encoder_last, encoder_last])

# Luong-style dot-product attention between decoder and encoder states.
attention = dot([decoder, encoder], axes=[2, 2])
attention = Activation('softmax')(attention)

context = dot([attention, encoder], axes=[2, 1])
decoder_combined_context = concatenate([context, decoder])

output = TimeDistributed(Dense(64, activation="tanh"))(decoder_combined_context)  # equation (5) of the paper
output = TimeDistributed(Dense(output_dict_size, activation="softmax"))(output)

model = Model(inputs=[encoder_input, decoder_input], outputs=[output])
model.compile(optimizer='adam', loss="binary_crossentropy", metrics=['accuracy'])

es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=200, min_delta=0.0005)
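
For reference, my own error-rate check compares whole decoded sequences against the references, along the lines of the sketch below (a simplification: decode_sequence and the data arrays are placeholders for my actual greedy decoding loop and test set, not my exact code):

def error_rate(model, encoder_inputs, references):
    # Sequence-level check: a prediction counts as wrong unless the
    # whole decoded output matches the reference exactly.
    wrong = 0
    for src, ref in zip(encoder_inputs, references):
        pred = decode_sequence(model, src)  # placeholder for the greedy decode loop
        if pred != ref:
            wrong += 1
    return wrong / len(references)

A check like this is much stricter than Keras's per-timestep val_acc, since a single wrong token makes the whole sequence count as an error.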

I will preface this by saying that if you're new to ML, I would advise you to learn more "traditional" algorithms before turning to neural networks. Furthermore, your task is so specific to company names and locations that using Latent Semantic Analysis (or a similar statistical method) to generate your embeddings, and an SVM to determine which words are relevant, might give you better results than neural networks, with less experimentation and less training time.
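
As a rough illustration of the kind of pipeline I mean (a sketch only; the toy examples and the token-window framing are my assumptions, not a tuned setup):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Toy examples: one instance per token, built from a small window around it;
# label 1 = the focus token is a company name or location.
contexts = [
    "Google shares resources",   # window around "Google"
    "shares resources with",     # window around "resources"
    "with Japan based",          # window around "Japan"
    "based company",             # window around "company"
]
labels = [1, 0, 1, 0]

pipeline = make_pipeline(
    TfidfVectorizer(),             # bag-of-words counts, tf-idf weighted
    TruncatedSVD(n_components=2),  # LSA: project onto latent semantic axes
    LinearSVC(),                   # decide which tokens are relevant
)
pipeline.fit(contexts, labels)

With real data you would use far more examples and components, but the structure stays this simple, and it trains in seconds rather than hours.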

Now with all that being said, here's what I can gather. If I understand correctly, you have a separate, second validation set on which you get a 40% error rate. All the numbers in the screenshots are pretty good, which leads me to two possible conclusions: either your second validation set is very different from your first one and you're suffering from a bit of overfitting, or there's a bug somewhere in your code that leads Keras to believe your model is doing great when in fact it isn't. (Bear in mind I'm not very familiar with Keras, so I don't know how likely the latter option is.)
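
One Keras-specific candidate for such a bug, and this is an assumption on my part since I can't run your code: compiling with loss="binary_crossentropy" while the output layer is a softmax over output_dict_size classes makes Keras report binary accuracy, which scores all the (mostly zero) output units independently and therefore looks near-perfect even when the predicted tokens are wrong. The usual pairing for a softmax over a vocabulary is categorical cross-entropy:

# if targets are one-hot vectors over the output vocabulary
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# or, if targets are integer token ids
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

That alone could explain a 0.99 val_acc coexisting with a 40% real error rate.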

Now as for the model itself, your task is clearly extractive, meaning that your model doesn't need to paraphrase anything or come up with something that isn't in the source text. Your model should take that into account, and should never make mistakes like confusing India with New Zealand or Tencent with Google. You can probably base your model on recent work in extractive summarization, which is a fairly active field (more so than keyword and keyphrase extraction). Here's a recent article which uses a neural attention model; you can use Google Scholar to easily find more.
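
One simple way to make the model extractive by construction, sketched under the assumption that you can label each input token as relevant or not, is to replace the encoder-decoder with a per-token tagger, so the model can only ever return words that appear in the input:

from tensorflow.keras.layers import Input, Embedding, Bidirectional, LSTM, TimeDistributed, Dense
from tensorflow.keras.models import Model

tokens = Input(shape=(INPUT_LENGTH,))
x = Embedding(input_dict_size, 64, mask_zero=True)(tokens)
x = Bidirectional(LSTM(64, return_sequences=True))(x)
# One binary decision per input token: part of the answer or not.
keep = TimeDistributed(Dense(1, activation='sigmoid'))(x)

tagger = Model(inputs=tokens, outputs=keep)
tagger.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# At inference time, emit the input tokens whose predicted probability exceeds 0.5.

(Here binary cross-entropy is the right loss, since each token really is an independent yes/no decision.)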
