
How to use the Keras predict function for NLP models?

I have created an NLP classification model with Keras with no problems; my model shows 83.5% accuracy upon evaluation. However, when I want to use my model to predict a new set of tokenized words, it returns x arrays, where x is the number of tokens in the tokenized sentence I have given to my model to predict.

Here is the code example:

    import numpy as np

    toPredict = np.array([1, 2])

    prediction = self.model.predict(toPredict)
    print(prediction)

The values 1 and 2 are obviously just token values, but this will return an output of:

    [[0.24091144 0.20921658 0.3415633  0.20830865]
     [0.20159791 0.46421158 0.19968869 0.13450184]]

I may be missing something, but I thought the output would be a single array classifying the whole tokenized sentence, not each word individually. Am I feeding the model a badly formatted input? Please help!

To predict, you should feed the model input of the same shape as the training data that was fed into the model: the sequence must be 2-dimensional, and even the same length you set when padding the sequences. You can call tf.expand_dims(toPredict, 0) and then feed it into the model.
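The shape issue can be seen without a model at all: a 1-D array of tokens is interpreted as a batch of single-feature samples, while adding a leading batch dimension makes it one sample. A minimal sketch using np.expand_dims, which behaves like tf.expand_dims here:

```python
import numpy as np

tokens = np.array([1, 2])          # shape (2,): Keras treats this as 2 separate samples
batch = np.expand_dims(tokens, 0)  # shape (1, 2): one sample containing 2 tokens
print(tokens.shape, batch.shape)   # (2,) (1, 2)
```

Passing the batched array to model.predict yields a single row of class probabilities instead of one row per token.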

For instance, here I will define a function for prediction:

    def predict_text(input_text, tokenizer, model, maxlen_seq,
                     padding='post', truncating='post'):
        """Tokenize, pad, and classify a single text with the given model."""
        text = str(input_text)
        # convert the text to a sequence of token ids
        sequence = tokenizer.texts_to_sequences([text])
        # pad/truncate to the same length used when training the model
        sequence = keras.preprocessing.sequence.pad_sequences(
            sequence, maxlen=maxlen_seq, padding=padding, truncating=truncating)
        return model.predict(sequence)
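The function returns class probabilities with shape (1, num_classes); to turn that into a label you can take the argmax. A minimal numpy-only sketch, using a row like the example output above as a stand-in for a real model's prediction:

```python
import numpy as np

# stand-in for the function's output: one row of 4 class probabilities
predict = np.array([[0.2409, 0.2092, 0.3416, 0.2083]])
label = int(np.argmax(predict, axis=-1)[0])
print(label)  # 2: the index of the most probable class
```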
