regex to match string with a minimum number of words

Question

I need a regex to match a string only if it contains at least X words (where a word is defined as any continuous non-whitespace sequence).

I am using re.findall().

Answer 1

Hmm, you could use the character class \S, quantified as \S+, to designate a word.

\S is equivalent to [^\s], which for ASCII text is itself equivalent to [^ \v\t\f\n\r] (in the order I typed them: space, vertical tab, horizontal tab, form feed, newline, carriage return).

[^ ... ] indicates a negated class, where all characters will be matched, except those inside the class.
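As a quick check of the equivalence described above, here is a minimal sketch; the sample string and variable names are made up for illustration.

```python
import re

# Hypothetical sample with a mix of spaces, a tab, and a newline.
sample = "one  two\tthree\nfour"

# \S+ and [^\s]+ are two spellings of the same negated whitespace class,
# so both should return the same list of words.
words_shorthand = re.findall(r"\S+", sample)
words_negated = re.findall(r"[^\s]+", sample)

print(words_shorthand)
print(words_shorthand == words_negated)
```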

Now, for what you're trying to do, I would rather use re.match like so:

re.match(r'\s*\S+(?:\s+\S+){X-1,}', text_to_validate)

(?:\s+\S+) matches whitespace followed by a word.

{X-1,} means that the group (?:\s+\S+) should appear at least X-1 times to match. If X=4, then it becomes {3,}.
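Since {X-1,} is a placeholder rather than valid regex syntax, the actual number has to be substituted before the pattern is compiled. A minimal sketch of one way to do that follows; the helper name has_min_words and its parameters are assumptions for illustration, not part of the original answer.

```python
import re

def has_min_words(text, min_words):
    """Return True if text contains at least min_words words.

    A word is any run of non-whitespace characters.
    (Hypothetical helper, named here for illustration only.)
    """
    if min_words < 1:
        # Every string trivially contains at least zero words.
        return True
    # Substitute the real number for the X-1 placeholder, e.g. {3,} for 4 words.
    pattern = r"\s*\S+(?:\s+\S+){%d,}" % (min_words - 1)
    return re.match(pattern, text) is not None

print(has_min_words("one two three four", 4))
print(has_min_words("one two three", 4))
```

re.match only anchors at the start of the string, which is enough here: once X words have been found, anything after them is irrelevant.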

Alternatively, split on whitespace and count the number of elements:

re.split(r"\s+", text_to_validate)
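One thing to watch with the split approach: leading or trailing whitespace produces empty strings in the result, which would inflate the count. The sketch below shows the effect on a made-up sample string, along with two ways around it: stripping first, or using str.split() with no arguments, which is a swapped-in alternative to the regex rather than what the answer proposed.

```python
import re

# Hypothetical input with leading, trailing, and doubled whitespace.
text_to_validate = "  one two  three "

# Splitting as-is leaves an empty string at each padded edge.
pieces = re.split(r"\s+", text_to_validate)

# Stripping first removes the empty edge elements.
pieces_stripped = re.split(r"\s+", text_to_validate.strip())

# str.split() with no arguments splits on whitespace runs and
# never returns empty strings.
words = text_to_validate.split()

print(len(pieces), len(pieces_stripped), len(words))
```

Note that re.split on an empty string still returns one empty element, so str.split() is the safer choice when the input may be empty.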

Answer 2

import re

subject = """I need a regex to match a string only if it contains at least X words.
Where a word is defined as any continuous non-whitespace sequence.
I am using Python 3 and re.findall()"""

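# Each run of non-whitespace characters counts as one word.
result = re.findall(r"\S+", subject)

# More than 5 words, i.e. at least 6; adjust the threshold as needed.
if len(result) > 5:
    print("yes")
else:
    print("no")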

http://labs.codecademy.com/

2 answers

solution1
3 ACCEPTED 2013-12-17 11:55:42

solution2
-2 2013-12-17 12:24:00
