
How to use Apache OpenNLP to analyze text in a Java web application?

I'm developing a Java web application where users can enter requests through a text box. I need to analyze the user's text input (customer requests), compare it with the web application's database, and show suitable suggestions or results to the customer. Is this possible with OpenNLP? Please give me some advice.

This sounds like a "More Like This" kind of use case rather than an NLP use case, but it depends on some details.

If you need to extract specific product names from the customer request and then look them up, you could train a Named Entity Recognition (NER) model on your data using OpenNLP's name finder. That may be overkill for this use case, though: unless you have a ton of data with a ton of product names, you could probably just use a regex match against a solid list of product names.
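As a rough illustration of the regex route, here is a minimal sketch using only the Java standard library. The class name `ProductNameMatcher`, the method `findProducts`, and the product names in the usage note are all made up for the example. It assumes your product names start and end with letters or digits (so `\b` word boundaries behave) and that case-insensitive ASCII matching is good enough.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: finds known product names inside free-text customer requests.
class ProductNameMatcher {
    private final Pattern pattern;
    // Maps any casing of a name back to the spelling used in the product list.
    private final Map<String, String> canonical = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    ProductNameMatcher(List<String> productNames) {
        if (productNames.isEmpty()) {
            throw new IllegalArgumentException("productNames must not be empty");
        }
        for (String name : productNames) {
            canonical.put(name, name);
        }
        // Longest names first, so "ThinkPad X1" wins over "ThinkPad".
        // Pattern.quote keeps characters like '+' or '.' in names literal.
        String alternation = productNames.stream()
                .sorted(Comparator.comparingInt(String::length).reversed())
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        pattern = Pattern.compile("\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
    }

    // Returns the distinct product names mentioned in the request, in order of first mention.
    List<String> findProducts(String request) {
        Set<String> found = new LinkedHashSet<>();
        Matcher m = pattern.matcher(request);
        while (m.find()) {
            found.add(canonical.get(m.group()));
        }
        return new ArrayList<>(found);
    }
}
```

You would build the matcher once from your product table, e.g. `new ProductNameMatcher(List.of("ThinkPad", "ThinkPad X1", "Pixel 8"))`, and call `findProducts(requestText)` per request. The names it returns can then be looked up in the database directly.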

If you need to "fuzzy match" the whole customer request to other customer requests, to product descriptions, or something similar, you would likely be better off using something like Elasticsearch to index your database entries, then passing the customer request to the "more like this" function, which returns the N best matches (scored) on the fly. In fact, I would recommend this approach first, since it requires none of the model maintenance, training data, feature extraction, etc. that come with NER.
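To give an idea of what the "more like this" request looks like, here is a minimal sketch that builds the JSON body for an Elasticsearch `more_like_this` query in plain Java. The class `MoreLikeThisQuery` and the field name `description` are assumptions for the example, and the JSON is built by hand only to keep it dependency-free. In a real application you would probably use a JSON library or the official Java client.

```java
// Hypothetical helper: builds the request body for an Elasticsearch more_like_this search.
class MoreLikeThisQuery {

    // likeText: the raw customer request; field: the indexed text field to compare against;
    // size: how many scored matches to return.
    static String build(String likeText, String field, int size) {
        return "{"
                + "\"size\":" + size + ","
                + "\"query\":{\"more_like_this\":{"
                + "\"fields\":[\"" + escape(field) + "\"],"
                + "\"like\":\"" + escape(likeText) + "\","
                // Lowered from the defaults so short requests and small indexes still match.
                + "\"min_term_freq\":1,"
                + "\"min_doc_freq\":1"
                + "}}}";
    }

    // Minimal JSON string escaping for user-supplied text.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }
}
```

The resulting string would be sent as the body of a search request against whichever index holds your database entries. The hits come back ordered by score, so the top N can be shown to the customer as suggestions.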

HTH

The Elasticsearch documentation describes the MLT (More Like This) function in detail.
