Count number of words in string using JavaScript

I am trying to count the number of words in a given string using the following code:

var el = document.getElementById('MSO_ContentTable');
var t = el.textContent;
var total;

if (t == undefined) {
  // textContent is not available, so fall back to innerText
  total = el.innerText;
} else {
  total = t;
}
var countTotal = cword(total);

function cword(w) {
  var count = 0;
  var words = w.split(" ");
  for (var i = 0; i < words.length; i++) {
    // count every non-empty piece produced by the split
    if (words[i] != "") {
      count += 1;
    }
  }

  return (count);
}

In that code I am getting data from a div tag and sending it to the cword() function for counting. However, the return value is different in IE and Firefox. Is any change required in the regular expression? I have checked that both browsers pass the same string, so the problem appears to be inside the cword() function.
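A minimal sketch of one way cword() can return different counts for text that looks the same, assuming the two strings contain the same words but different whitespace (the sample strings and the function names cwordSpaces and cwordWhitespace are made up for illustration). split(" ") only breaks on the space character, so words separated by newlines or tabs are counted as a single word:

```javascript
// Same logic as cword(): split on a single space, skip empty pieces
function cwordSpaces(w) {
  return w.split(" ").filter(function (s) { return s !== ""; }).length;
}

// Variant that splits on any run of whitespace (spaces, tabs, newlines)
function cwordWhitespace(w) {
  return w.split(/\s+/).filter(function (s) { return s !== ""; }).length;
}

var spaced = "one two three";      // words separated by spaces
var mixed = "one\ntwo\tthree";     // same words, separated by newline and tab

console.log(cwordSpaces(spaced));     // 3
console.log(cwordSpaces(mixed));      // 1
console.log(cwordWhitespace(mixed));  // 3
```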

[Edit 2022, based on a comment] Nowadays, one would not extend the native prototype this way. A way to extend the native prototype without the danger of naming conflicts is to use the ES20xx Symbol. Here is an example of a wordcounter using that.
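A minimal sketch of the Symbol approach, not necessarily the code the answer originally linked to. The symbol name countWords and the choice of /\S+/g as the word pattern are assumptions made for illustration:

```javascript
// A unique symbol cannot clash with any existing or future property name
const countWords = Symbol('countWords');

// Attach the method to String.prototype under the symbol key (non-enumerable)
Object.defineProperty(String.prototype, countWords, {
  value: function () {
    // Assumption: a "word" is any run of non-whitespace characters
    const matches = String(this).match(/\S+/g);
    return matches ? matches.length : 0;
  },
  writable: true,
  configurable: true,
  enumerable: false
});

console.log('this string has five words'[countWords]()); // 5
console.log(''[countWords]());                            // 0
```

Only code that holds a reference to the symbol can call the method, so it cannot collide with a countWords property defined elsewhere.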

Old answer: you can use split and add a wordcounter to the String prototype:

if (!String.prototype.countWords) {
  String.prototype.countWords = function() {
    return this.length && this.split(/\s+\b/).length || 0;
  };
}

console.log(`'this string has five words'.countWords() => ${'this string has five words'.countWords()}`);
console.log(`'this string has five words... and counting'.countWords() => ${'this string has five words... and counting'.countWords()}`);
console.log(`''.countWords() => ${''.countWords()}`);

I would prefer a RegEx-only solution:

var str = "your long string with many words.";
var wordCount = str.match(/(\w+)/g).length;
alert(wordCount); // 6

The regex is:

\w+    between one and unlimited word characters
/g     global - don't stop after the first match

The brackets create a group around every match, so the number of matched groups should equal the word count. (With the g flag, match() returns the full matches anyway, so the count is the same without the brackets.)
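One caveat with this approach: match() returns null when nothing matches, so reading .length throws on an empty or punctuation-only string. A minimal sketch of a guarded version (the function name countWordMatches is hypothetical):

```javascript
function countWordMatches(str) {
  // match() with the g flag returns an array of matches, or null if there are none
  var matches = str.match(/\w+/g);
  return matches === null ? 0 : matches.length;
}

console.log(countWordMatches("your long string with many words.")); // 6
console.log(countWordMatches(""));                                  // 0
```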

This is the best solution I've found:

function wordCount(str) {
  var m = str.match(/[^\s]+/g);
  return m ? m.length : 0;
}

This inverts whitespace selection, which is better than \w+ because \w+ only matches the Latin alphabet, digits and _ (see http://www.ecma-international.org/ecma-262/5.1/#sec-15.10.2.6 ).

If you're not careful with whitespace matching, you'll count empty strings, strings with leading and trailing whitespace, and all-whitespace strings as matches, while this solution handles strings like ' ' and ' a\t\t!\r\n#$%() d ' correctly (if you define 'correct' as 0 and 4).
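A short sketch comparing the two patterns. The accented sample text is an assumption chosen for illustration; the last two inputs are the ones quoted above:

```javascript
function wordCount(str) {
  var m = str.match(/[^\s]+/g);
  return m ? m.length : 0;
}

var text = 'na\u00efve caf\u00e9'; // "naïve café", written with escapes

// \w+ stops at the accented letters, so it finds "na", "ve" and "caf"
console.log((text.match(/\w+/g) || []).length); // 3

// [^\s]+ treats every run of non-whitespace as one word
console.log(wordCount(text));                     // 2

console.log(wordCount(' '));                      // 0
console.log(wordCount(' a\t\t!\r\n#$%() d '));    // 4
```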

You can make clever use of the replace() method, although you are not replacing anything.

var str = "the very long text you have...";

var counter = 0;

// let's loop through the string and count the words;
// each word is a run of word characters between word boundaries
str.replace(/\b\w+\b/g, function (a) {
   // for each word found increase the counter value by 1
   counter++;
});

alert(counter);

The regex can be improved, for example to exclude HTML tags.
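A rough sketch of that idea, assuming it is acceptable to strip tags with a simple pattern before counting (the function name countWordsIgnoringTags is hypothetical, and a regex like this is not a real HTML parser, so comments, scripts and attributes containing ">" are not handled):

```javascript
function countWordsIgnoringTags(html) {
  // Replace anything that looks like a tag with a space so that
  // adjacent words are not glued together
  var text = html.replace(/<[^>]*>/g, ' ');

  // Count the remaining runs of non-whitespace characters
  var matches = text.match(/\S+/g);
  return matches ? matches.length : 0;
}

console.log(countWordsIgnoringTags('<p>the <b>very</b> long text</p>')); // 4
```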

// Count words in a string (or what appears to be words)
function countWordsString(string) {
    // Collapse runs of whitespace into one space and trim both ends
    string = string.replace(/\s+/g, ' ').trim();

    // An empty string contains no words
    if (string === '') {
        return 0;
    }

    // Start at 1: n separators lie between n + 1 words
    var counter = 1;

    // Loop through the string and count the separators between words
    string.replace(/\s+/g, function (a) {
        counter++;
    });

    return counter;
}

// `string` holds the text to analyse
var numberWords = countWordsString(string);
