Validate that input string does not exceed word limit

I want to count the words in a specific string so that I can validate it and prevent users from writing more than, for example, 100 words.

I wrote this function, but I don't think it's effective enough. I used the explode function with space as a delimiter, but what if the user puts two spaces instead of one? Can you give me a better way to do that?

function isValidLength($text , $length){
  
   $text  = explode(" " , $text );
   if(count($text) > $length)
          return false;
   else
          return true;
}
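
To illustrate the concern with a small sketch (the sample string is made up for this example): when explode splits on a single space, two consecutive spaces produce an empty element, and that element is counted as a word.

```php
<?php
// Two spaces between "one" and "two" yield an empty element in the middle.
$parts = explode(' ', 'one  two');
var_dump($parts);            // "one", "", "two"
echo count($parts), "\n";    // 3, although there are only 2 words
```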

Maybe str_word_count could help

http://php.net/manual/en/function.str-word-count.php

$Tag  = 'My Name is Gaurav'; 
$word = str_word_count($Tag);
echo $word;

Try this:

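function get_num_of_words($string) {
    // Trim the ends and collapse every run of whitespace into one space.
    $string = preg_replace('/\s+/', ' ', trim($string));
    if ($string === '') {
        return 0; // explode() on an empty string would return one empty element
    }
    $words = explode(" ", $string);
    return count($words);
}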

$str = "Lorem ipsum dolor sit amet";
echo get_num_of_words($str);

This will output: 5

You can use the built-in PHP function str_word_count. Use it like this:

$str = "This is my simple string.";
echo str_word_count($str);

This will output 5.

If you plan on using special characters in any of your words, you can supply any extra characters as the third parameter.

$str = "This weather is like el ninã.";
echo str_word_count($str, 0, 'àáã');

This will output 6.
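
If you are unsure what str_word_count treats as a word, it can help to pass 1 as the second parameter, which returns the words as an array instead of a count. A small sketch, using a sample sentence chosen for illustration:

```php
<?php
// Format 1 returns the matched words as an array instead of a count.
$words = str_word_count("Hello fri3nd, you're looking good today!", 1);
print_r($words);
// Digits are not word characters by default, so "fri3nd" becomes "fri" and "nd",
// while the apostrophe keeps "you're" together.
echo count($words), "\n"; // 7
```

Here the digit splits one word into two, so the count can differ from what a reader would expect.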

str_word_count has its flaws: it treats underscores as word separators, so this_is counts as two words.

You can use the following function to count words separated by spaces, even if there is more than one space between them.

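function count_words($str){
    // Ignore leading and trailing spaces so they are not counted as extra words.
    $str = trim($str, " ");
    // Collapse every run of spaces into a single space.
    while (substr_count($str, "  ")>0){
        $str = str_replace("  ", " ", $str);
    }
    // n words are separated by n-1 spaces.
    return substr_count($str, " ")+1;
}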


$str = "This   is  a sample_test";

echo $str;
echo count_words($str);
//This will return 4 words;

This function uses a simple regex to split the input $text on any non-letter character:

function isValidLength($text, $length) {
    $words = preg_split('#\PL+#u', $text, -1, PREG_SPLIT_NO_EMPTY);
    return count($words) <= $length;
}

This ensures that it works correctly with words separated by multiple spaces or any other non-letter character. It also handles Unicode (e.g. accented letters) correctly.

The function returns true when the word count is less than or equal to $length.
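
To see which pieces the split produces, here is a small sketch. The helper name splitLetterWords is made up for this example, it uses the same pattern as above, and it assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper: same split as above, but returns the words themselves.
function splitLetterWords(string $text): array {
    return preg_split('#\PL+#u', $text, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(splitLetterWords("Hello   world,\tça va?")); // Hello, world, ça, va
print_r(splitLetterWords("don't stop"));             // don, t, stop
```

Note that an apostrophe is a non-letter too, so "don't" is counted as two words with this pattern.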

Use preg_split() instead of explode(). preg_split() supports regular expressions, so a single pattern can match any run of whitespace.
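
A minimal sketch of that idea, assuming a word is any whitespace-delimited token (the helper name countWhitespaceWords is made up for this example):

```php
<?php
// Hypothetical helper: counts whitespace-delimited tokens.
function countWhitespaceWords(string $text): int {
    // \s+ treats any run of spaces, tabs, or line breaks as one separator;
    // PREG_SPLIT_NO_EMPTY discards the empty pieces caused by leading or trailing whitespace.
    return count(preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY));
}

echo countWhitespaceWords("  This   is  a\tsample_test \n"), "\n"; // 4
echo countWhitespaceWords(""), "\n";                                // 0
```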

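You can use substr_count() to count the occurrences of any substring. Its basic signature is int substr_count ( string $haystack , string $needle ). To count words, set $needle to ' ' and add one, because n words are separated by n-1 spaces. This assumes that the words are separated by exactly one space each and that the text has no leading or trailing spaces.

$text = 'This is a test';
echo substr_count($text, 'is'); // 2 (once in "This", once in "is")

echo substr_count($text, ' ') + 1; // 4 words (3 spaces + 1)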

If you need more control over what counts as "a word" in your application, call preg_match_all(), which returns the number of matches. Add the u (unicode) pattern modifier if you need multibyte support. \pL matches letters and \pM matches combining marks; including both errs on the side of inclusivity. Treat this as a starting point: the regex rules for what is "a word" can be tightened or loosened as needed.

This solution is multibyte-safe.

Code:

function isValidLength($text, $length) {
    // Valid when the number of matched words does not exceed the limit.
    return preg_match_all("~[\pL\pM'-]+~u", $text) <= $length;
}
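
To check what this pattern counts, here is a small sketch. The helper name countPatternWords is made up for this example, and the sample strings assume a UTF-8 source file.

```php
<?php
// Hypothetical helper: returns how many matches the pattern above finds.
function countPatternWords(string $text): int {
    // preg_match_all() returns the number of full pattern matches.
    return (int) preg_match_all("~[\pL\pM'-]+~u", $text);
}

echo countPatternWords("It's a well-known façade"), "\n"; // 4
echo countPatternWords("3 apples"), "\n";                  // 1, digits are not matched
```

Apostrophes and hyphens stay inside a word here, while a standalone number is not counted. Add \pN to the character class if numbers should count as words.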

Alternatively, if it is a required field and you only need to count space-delimited "non-whitespace substrings", then you can just write:

if (preg_match("~^\s*\S+(\s+\S+){0,99}\s*$~", $text)) { ... }

or

if (preg_match("~^\S+(\s+\S+){0,99}$~", trim($text))) { ... }
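
If the limit is not fixed at 100, the quantifier can be built from a variable. This is a sketch under that assumption; the helper name hasAtMostWords is made up for this example.

```php
<?php
// Hypothetical helper: true when $text holds between 1 and $limit whitespace-delimited words.
function hasAtMostWords(string $text, int $limit): bool {
    if ($limit < 1) {
        return false;
    }
    // n words have n - 1 separators, hence the upper bound of $limit - 1 repetitions.
    $pattern = '~^\S+(?:\s+\S+){0,' . ($limit - 1) . '}$~';
    return preg_match($pattern, trim($text)) === 1;
}

var_dump(hasAtMostWords("one  two\tthree", 3)); // bool(true)
var_dump(hasAtMostWords("one two three", 2));   // bool(false)
var_dump(hasAtMostWords("   ", 100));           // bool(false): nothing left after trimming
```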

There are n-1 spaces between n objects, so there will be 99 spaces between 100 words. You can therefore pick an average word length, say 10 characters, multiply it by 100 (for 100 words), and add 99 (for the spaces). That lets you base the limit on the number of characters instead: 100 × 10 + 99 = 1099.

function isValidLength($text){
    // 100 words of 10 characters plus 99 spaces.
    return strlen($text) <= 1099;
}
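
The same arithmetic can be written as a small helper so the limit follows from the chosen word count and average length. This is only a sketch; the function name maxCharsForWords and the default of 10 characters per word are assumptions for this example. Keep in mind that strlen counts bytes, so multibyte characters use up more of the budget.

```php
<?php
// Hypothetical helper: character budget for $maxWords words of $avgWordLength characters each.
function maxCharsForWords(int $maxWords, int $avgWordLength = 10): int {
    // n words plus the n - 1 spaces between them.
    return $maxWords * $avgWordLength + ($maxWords - 1);
}

echo maxCharsForWords(100), "\n"; // 1099
```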

I wrote a function that works better than str_word_count, because that PHP function counts dashes and other characters as words.

My function also addresses the issue of double spaces, which many of the functions other people have written don't account for.

It handles HTML tags as well. If two elements are adjacent and you simply call strip_tags, their text is joined and counted as one word when it should be two. For example: <h1>Title</h1>Text or <h1>Title</h1><p>Text</p>

Additionally, I strip out JavaScript first; otherwise the code within the <script> tags would be counted as words.

Lastly, my function handles spaces at the beginning and end of a string, multiple spaces, line breaks, carriage returns, and tab characters.

###############
# Count Words #
###############
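function count_words($str)
{
 // Remove <script> blocks so their code is not counted as words.
 $str = preg_replace('~<\s*\bscript\b[^>]*>(.*?)<\s*\/\s*script\s*>~is','',$str);
 // Turn line breaks and tabs into spaces, then pad tags so adjacent elements stay separate.
 $str = str_replace(array("\n","\r","\t"),' ',$str);
 $str = str_replace('<',' <',str_replace('>','> ',$str));
 // Strip the tags, then drop everything except ASCII letters, digits, and spaces.
 $str = preg_replace("/[^A-Za-z0-9 ]/","",strip_tags($str));
 while(substr_count($str,'  ')>0)
 {
  $str = str_replace('  ',' ',$str);
 }
 return substr_count(trim($str,' '),' ')+1;
}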

The technical post webpages of this site follow the CC BY-SA 4.0 license. If you need to reprint, please indicate the site URL or the original address.

 