
Count number of words in string using JavaScript

I am trying to count the number of words in a given string using the following code:

var t = document.getElementById('MSO_ContentTable').textContent;

if (t == undefined) {
  var total = document.getElementById('MSO_ContentTable').innerText;                
} else {
  var total = document.getElementById('MSO_ContentTable').textContent;        
}
countTotal = cword(total);   

function cword(w) {
  var count = 0;
  var words = w.split(" ");
  for (i = 0; i < words.length; i++) {
    // inner loop -- do the count
    if (words[i] != "") {
      count += 1;
    }
  }

  return (count);
}

In that code I am getting data from a div tag and sending it to the cword() function for counting. However, the return value is different in IE and Firefox. Is any change required in the regular expression? I have checked that both browsers send the same string, so the problem must be inside the cword() function.

[Edit 2022, based on a comment] Nowadays, one would not extend the native prototype this way. A way to extend the native prototype without the danger of naming conflicts is to use an ES2015 Symbol as the property key. Here is an example of a word counter using that.
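The example linked from that edit is not reproduced here, so the following is only a minimal sketch of the idea, assuming a symbol named countWords (a made-up name) and a whitespace-based definition of a word:

```javascript
// Hypothetical symbol key; the name countWords is illustrative only.
const countWords = Symbol("countWords");

// Attach the method under the symbol, so it cannot collide with a
// string-named property added by another script or a future standard.
Object.defineProperty(String.prototype, countWords, {
  value: function () {
    const words = this.match(/\S+/g); // null when there are no words
    return words ? words.length : 0;
  },
  writable: false,
  enumerable: false,
});

console.log("this string has five words"[countWords]()); // 5
console.log(""[countWords]());                            // 0
```

Only code that holds a reference to the symbol can call the method, which is what keeps it out of the way of other libraries.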

Old answer: you can use split and add a wordcounter to the String prototype:

if (!String.prototype.countWords) {
  String.prototype.countWords = function() {
    return this.length && this.split(/\s+\b/).length || 0;
  };
}

console.log(`'this string has five words'.countWords() => ${
  'this string has five words'.countWords()}`);
console.log(`'this string has five words... and counting'.countWords() => ${
  'this string has five words... and counting'.countWords()}`);
console.log(`''.countWords() => ${''.countWords()}`);

I would prefer a RegEx only solution:

var str = "your long string with many words.";
var wordCount = str.match(/(\w+)/g).length;
alert(wordCount); //6

The regex is

\w+    between one and unlimited word characters
/g     global - don't stop after the first match

The parentheses create a capture group, but they are optional here: with the g flag, match() returns an array of every full match, so the length of that array is the word count.
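To make the effect of the g flag concrete, here is a small sketch. It also shows one caveat: match() returns null when nothing matches, so reading .length directly throws for an empty string. The helper name countWordChars is made up for this illustration.

```javascript
const str = "your long string with many words.";

// Without g: only the first match (plus its capture groups) is returned.
const first = str.match(/(\w+)/);
console.log(first[0]); // "your"

// With g: an array of every full match; capture groups are not included.
const all = str.match(/\w+/g);
console.log(all.length); // 6

// match() returns null when there is no match, so guard before reading length.
function countWordChars(s) {
  const m = s.match(/\w+/g);
  return m ? m.length : 0;
}
console.log(countWordChars("")); // 0
```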

This is the best solution I've found:

function wordCount(str) {
  var m = str.match(/[^\s]+/g);
  return m ? m.length : 0;
}

This matches runs of non-whitespace characters, which is better than \w+ because \w only matches ASCII letters, digits and _ (see http://www.ecma-international.org/ecma-262/5.1/#sec-15.10.2.6 )
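A quick comparison of the two patterns, as a sketch. The helper names byWordChars and byNonSpace and the sample strings are chosen for illustration only:

```javascript
// Count runs of \w characters (ASCII letters, digits and underscore).
const byWordChars = (s) => (s.match(/\w+/g) || []).length;
// Count runs of non-whitespace characters.
const byNonSpace = (s) => (s.match(/[^\s]+/g) || []).length;

// The apostrophe splits "don't" into two \w runs.
console.log(byWordChars("don't stop"), byNonSpace("don't stop")); // 3 2

// Cyrillic letters are not matched by \w at all.
console.log(byWordChars("привет мир"), byNonSpace("привет мир")); // 0 2
```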

If you're not careful with whitespace matching, you'll count empty strings, strings with leading and trailing whitespace, and all-whitespace strings as matches. This solution handles strings like ' ' and ' a\t\t!\r\n#$%() d ' correctly (if you define 'correct' as 0 and 4).
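The edge cases mentioned above can be checked with a short sketch. wordCount is repeated so the snippet runs on its own; naiveCount is a hypothetical helper showing what a plain split on a single space reports for the same inputs:

```javascript
function wordCount(str) {
  const m = str.match(/[^\s]+/g);
  return m ? m.length : 0;
}

// Hypothetical naive version: counts every piece, including empty ones.
const naiveCount = (s) => s.split(" ").length;

console.log(naiveCount(""), wordCount(""));   // 1 0
console.log(naiveCount(" "), wordCount(" ")); // 2 0
console.log(wordCount("  leading and trailing  ")); // 3
console.log(wordCount(" a\t\t!\r\n#$%() d "));      // 4
```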

You can make clever use of the replace() method, even though you are not replacing anything.

var str = "the very long text you have...";

var counter = 0;

// let's loop through the string and count the words
str.replace(/\b\w+\b/g, function (a) {
   // for each word found increase the counter value by 1
   counter++;
});

alert(counter);

The regex can be improved, for example to exclude HTML tags.
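One possible way to do that, as a rough sketch: strip anything that looks like a tag first, then count with the same replace() trick. The function name countWordsIgnoringTags is made up here, and a regex like this only approximates HTML; it is not a parser.

```javascript
// Hypothetical helper: drop anything of the form <...> before counting.
// This is an approximation only; it does not handle every HTML construct.
function countWordsIgnoringTags(html) {
  // Replace each tag with a space so adjacent words stay separated.
  const text = html.replace(/<[^>]*>/g, " ");

  let counter = 0;
  // Use replace() purely as an iterator over non-whitespace runs.
  text.replace(/\S+/g, function () {
    counter++;
    return "";
  });
  return counter;
}

console.log(countWordsIgnoringTags("<p>Hello <b>brave</b> new world</p>")); // 4
```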

// Count words in a string, or what appears as words :-)
function countWordsString(string) {
  // Collapse each run of whitespace into one space and trim the ends
  string = string.replace(/\s+/g, ' ').trim();

  // An empty string has no words
  if (string === '') {
    return 0;
  }

  // Start at one word, then add one for every separator found
  var counter = 1;
  string.replace(/\s/g, function (a) {
    counter++;
  });

  return counter;
}

var numberWords = countWordsString(string);

