
What kind of neural network should I use to extract key information from a sentence for RDF rules?

I am working on my paper, and one of the tasks is to extract the company name and location from sentences of the following type:

"Google shares resources with Japan based company."

Here, I want the output to be "Google Japan". The sentence structure may also vary, e.g. "Japan based company can access the resources of Google". I have tried an attention-based NN, but the error rate is around 0.4. Can anyone give me a hint about which model I should use?

I printed out the validation process like this: [screenshot: validation print]

I also got the graphs of the loss and accuracy: [screenshot: loss and accuracy]

They show a val_acc of 0.99. Does this mean my model is good at predicting? But then why do I get a 0.4 error rate when I use my own validation function? I am very new to ML. What does val_acc actually mean?

Here is my model:

from keras.layers import Input, Embedding, LSTM, Dense, Activation, TimeDistributed, dot, concatenate
from keras.models import Model
from keras.callbacks import EarlyStopping

encoder_input = Input(shape=(INPUT_LENGTH,))
decoder_input = Input(shape=(OUTPUT_LENGTH,))

# Encoder: embed the input tokens and run them through an LSTM
encoder = Embedding(input_dict_size, 64, input_length=INPUT_LENGTH, mask_zero=True)(encoder_input)
encoder = LSTM(64, return_sequences=True, unroll=True)(encoder)
encoder_last = encoder[:, -1, :]  # final encoder state, used to initialise the decoder

# Decoder: embed the target tokens, initialise its LSTM with the encoder's final state
decoder = Embedding(output_dict_size, 64, input_length=OUTPUT_LENGTH, mask_zero=True)(decoder_input)
decoder = LSTM(64, return_sequences=True, unroll=True)(decoder, initial_state=[encoder_last, encoder_last])

# Dot-product attention between decoder and encoder states
attention = dot([decoder, encoder], axes=[2, 2])
attention = Activation('softmax')(attention)

context = dot([attention, encoder], axes=[2, 1])
decoder_combined_context = concatenate([context, decoder])

output = TimeDistributed(Dense(64, activation="tanh"))(decoder_combined_context)  # equation (5) of the paper
output = TimeDistributed(Dense(output_dict_size, activation="softmax"))(output)

model = Model(inputs=[encoder_input, decoder_input], outputs=[output])
model.compile(optimizer='adam', loss="binary_crossentropy", metrics=['accuracy'])

es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=200, min_delta=0.0005)

I will preface this by saying that if you're new to ML, I would advise you to learn more "traditional" algorithms before turning to neural networks. Furthermore, your task is so specific to company names and locations that using Latent Semantic Analysis (or a similar statistical method) to generate your embeddings and an SVM to decide which words are relevant might give you better results than neural networks, with less experimentation and less training time.
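To make that suggestion concrete, here is a minimal sketch of such a pipeline in scikit-learn, where LSA is implemented as TF-IDF followed by TruncatedSVD and an SVM classifies each word (via its surrounding context) as extract/skip. The toy data, labels, and hyperparameters below are placeholders, not a tuned setup:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# One instance per candidate word: the word plus its surrounding context
contexts = [
    "Google shares resources with",   # context of "Google"
    "with Japan based company",       # context of "Japan"
    "shares resources with Japan",    # context of "shares"
]
labels = [1, 1, 0]  # 1 = extract this word, 0 = skip it

# TF-IDF followed by TruncatedSVD is the usual scikit-learn recipe for LSA
pipeline = make_pipeline(
    TfidfVectorizer(),
    TruncatedSVD(n_components=2),  # use far more components on real data
    SVC(kernel="rbf"),
)
pipeline.fit(contexts, labels)
print(pipeline.predict(["resources of Google"]))

On real data you would build one instance per word from a fixed-size context window around it and keep a few hundred SVD components.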

Now with all that being said, here's what I can gather. If I understand correctly, you have a separate, second validation set on which you get a 40% error rate. All the numbers in the screenshots are pretty good, which leads me to two possible conclusions: either your second validation set is very different from your first one and you're suffering from a bit of overfitting, or there's a bug somewhere in your code that leads Keras to believe your model is doing great when in fact it isn't. (Bear in mind I'm not very familiar with Keras, so I don't know how likely the latter option is.)
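On the second possibility, one well-known Keras pitfall fits your symptoms, assuming your targets are one-hot encoded: compiling a multi-class softmax output with loss="binary_crossentropy" makes metrics=['accuracy'] resolve to binary accuracy, averaged over every entry of the one-hot vectors. Since almost all of those entries are 0, a model can report a val_acc of 0.99 while still predicting many wrong tokens. A sketch of the fix, plus a hand-rolled error rate so Keras and your own function measure the same thing (x_val_encoder, x_val_decoder, and y_val are placeholder names):

import numpy as np

# Categorical cross-entropy makes 'accuracy' mean "fraction of correct tokens"
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Per-token error rate computed by hand (ignores padding/masking for brevity)
pred = model.predict([x_val_encoder, x_val_decoder]).argmax(axis=-1)
true = y_val.argmax(axis=-1)
print("token error rate:", np.mean(pred != true))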

Now as for the model itself, your task is clearly extractive, meaning that your model doesn't need to paraphrase anything or come up with something that isn't in the source text. Your model should take that into account, and should never make mistakes like confusing India with New Zealand or Tencent with Google. You can probably base your model on recent work in extractive summarization, which is a fairly active field (more so than keyword and keyphrase extraction). Here's a recent article which uses a neural attention model; you can use Google Scholar to easily find more.
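One simple way to bake the extractive constraint into the architecture is to reframe the task as token-level tagging, as in named-entity recognition: the model labels each input token extract/skip, so by construction it can never output a word that isn't in the sentence. A minimal Keras sketch, reusing INPUT_LENGTH and input_dict_size from your code (the layer sizes are placeholders):

from keras.layers import Input, Embedding, LSTM, Bidirectional, TimeDistributed, Dense
from keras.models import Model

tokens = Input(shape=(INPUT_LENGTH,))
x = Embedding(input_dict_size, 64, mask_zero=True)(tokens)
x = Bidirectional(LSTM(64, return_sequences=True))(x)
keep = TimeDistributed(Dense(1, activation='sigmoid'))(x)  # P(extract) per token

tagger = Model(inputs=tokens, outputs=keep)
tagger.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Targets have shape (n_samples, INPUT_LENGTH, 1): e.g. 1 at the positions of
# "Google" and "Japan", 0 everywhere else. Here binary cross-entropy is the
# appropriate loss, because each token really is a binary decision.

With this formulation, your "Google Japan" output is just the tokens tagged 1, read off in input order.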
