
Finding keywords in a paragraph of text with JavaScript

How can I take keywords from an input (e.g. a textarea or a text field) in JavaScript and store them in an array of strings, keeping only the words whose length is greater than or equal to seven? Let me show you an example. I have the following HTML:

<html>
 ...
   <body>
          <textarea id="keyword" cols="10" rows="20" placeholder="write content here"></textarea>
   </body>
 </html>

I fill the textarea with the following content:

The Eloquent ORM included with Laravel provides a beautiful, simple ActiveRecord implementation for working with your database.

Then I want to store the keywords in JavaScript, for instance:

<javascript>
       var keywords = ['Eloquent', 'included', 'provides', 'Laravel','beautiful', 'ActiveRecord', 'implementation', 'working', 'database'];
</javascript>

How can I do this?

The function you are trying to create is not possible in general, because any informative sentence contains countless words that could pass for keywords like these. Neither JavaScript nor PHP has any clue whether these words are special; it's just us humans who see them as keywords. So, for this problem, you might want to define the array yourself.

*There is one possible way. You can check each word of the paragraph, and if its first letter is a capital letter, store it (because most of your keywords begin with a capital letter). Work out the code for this logic.
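A minimal sketch of that idea, alongside the length rule from the question, might look like the following. The helper names (`extractWords`, `capitalizedWords`, `longWords`) are hypothetical, and treating a word as a run of ASCII letters is an assumption that will not suit every language.

```javascript
// Hypothetical helpers; none of these names come from the question or the answers.

// Split text into words. Matching runs of ASCII letters drops punctuation
// such as "," and "." (an assumption: non-ASCII letters are not handled).
function extractWords(text) {
  return text.match(/[A-Za-z]+/g) || [];
}

// Heuristic from the answer: keep words that start with a capital letter.
function capitalizedWords(text) {
  return extractWords(text).filter(function (word) {
    return /^[A-Z]/.test(word);
  });
}

// Rule from the question: keep words with at least minLength characters.
function longWords(text, minLength) {
  return extractWords(text).filter(function (word) {
    return word.length >= minLength;
  });
}

var sample = 'The Eloquent ORM included with Laravel provides a beautiful, ' +
  'simple ActiveRecord implementation for working with your database.';

console.log(capitalizedWords(sample));
// [ 'The', 'Eloquent', 'ORM', 'Laravel', 'ActiveRecord' ]

console.log(longWords(sample, 7));
// [ 'Eloquent', 'included', 'Laravel', 'provides', 'beautiful',
//   'ActiveRecord', 'implementation', 'working', 'database' ]
```

In a browser, the text could be read from the textarea in the question with `document.getElementById('keyword').value` and passed to either function. On this sample the length rule returns the same nine words as the question's list (in sentence order), while the capital-letter heuristic also picks up "The" and "ORM".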

I would try POS tagging (part-of-speech tagging).

This might point you in the right direction: https://github.com/dariusk/pos-js

What I would do is determine the part of speech for every word, then run through the list and add the nouns and adjectives to an array.

It won't be perfect, but it'll be a start. Here's a code example:

// Requires the pos package from https://github.com/dariusk/pos-js (npm install pos)
var pos = require('pos');
var words = new pos.Lexer().lex('The Eloquent ORM included with Laravel provides a beautiful, simple ActiveRecord implementation for working with your database');
var tagger = new pos.Tagger();
var taggedWords = tagger.tag(words);
var output = [];
// Use an indexed loop: the original for...in leaked a global "i"
for (var i = 0; i < taggedWords.length; i++) {
    var taggedWord = taggedWords[i];
    var word = taggedWord[0];
    var tag = taggedWord[1];
    // keep the word if its tag starts with NN (noun) or JJ (adjective)
    if (tag.indexOf('NN') === 0 || tag.indexOf('JJ') === 0) {
        output.push(word);
    }
}
console.log(output);

My output was:

[ 'Eloquent',
'ORM',
'Laravel',
'beautiful',
'simple',
'ActiveRecord',
'implementation',
'database' ]

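Basically, only 'simple' turned up as an extra. Compared with the list in the question, 'ORM' is also included, while 'included', 'provides', and 'working' are missing, presumably because they were not tagged as nouns or adjectives.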
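One way to tighten the result, sketched here as an assumption rather than as part of the answer, is to apply the question's length rule on top of the tagger output. The array below is copied from the output shown above; `keepLongWords` and `MIN_LENGTH` are hypothetical names.

```javascript
// Tagger output copied from the answer above.
var tagged = ['Eloquent', 'ORM', 'Laravel', 'beautiful', 'simple',
  'ActiveRecord', 'implementation', 'database'];

// Threshold from the question: words of seven or more characters.
var MIN_LENGTH = 7;

// Hypothetical helper: keep only words with at least minLength characters.
function keepLongWords(words, minLength) {
  return words.filter(function (word) {
    return word.length >= minLength;
  });
}

var keywordsByLength = keepLongWords(tagged, MIN_LENGTH);
console.log(keywordsByLength);
// [ 'Eloquent', 'Laravel', 'beautiful', 'ActiveRecord', 'implementation', 'database' ]
```

This drops 'ORM' and 'simple', but it cannot bring back words the tagger filtered out earlier, so the result is still shorter than the list in the question.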


 