
Ruby regex counting characters

I am trying to create a regex in Ruby that matches strings containing at least 10 characters that are not special characters, i.e. characters that \w would match. So far I have come up with /\w{10,}/, but it only counts a consecutive run of word characters. I want to match any string that contains at least 10 "word" characters in total, whether or not they are adjacent. Is this possible? I am fairly new to regex, so any help would be appreciated.

If I understood correctly, this should work:

/(?:\w[^\w]*){9,}\w/

Explanation:

We start with a single

\w

We want to consume all the other characters up to the next \w, hence:

\w[^\w]*

[^<list of chars>] matches any character other than those listed in the brackets, so [^\w] means any character that is not a word character (equivalent to \W). * denotes 0 or more repetitions. The pattern above matches "a-- ", "b" and "c!" in the string "a-- bc!".
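To check the claim about "a-- bc!", here is a quick sketch using String#scan to list each chunk the sub-pattern matches. The constant names are only illustrative.

```ruby
# One word character followed by any run of non-word characters.
CHUNK = /\w[^\w]*/

# scan returns every non-overlapping match, left to right.
CHUNKS = "a-- bc!".scan(CHUNK)
p CHUNKS  # => ["a-- ", "b", "c!"]
```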

Since we need 10 \w, we will match 9 (or more) groups like that, followed by a single \w

(\w[^\w]*){9,}\w

We don't need the captures here (a repeated capturing group only keeps its last repetition anyway), so we make the group non-capturing:

(?:\w[^\w]*){9,}\w

Alternatively, we could use a simpler regex:

(?:\w[^\w]*){10,}

But its match will also include any non-word characters that follow the last word character in the string. I am not sure whether that matters here.
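The difference between the two variants only shows up in the matched text, not in whether a string matches. A small sketch, with made-up constant names and sample input:

```ruby
# Variant that must end on a word character.
WITH_FINAL_WORD = /(?:\w[^\w]*){9,}\w/
# Simpler variant; the last group may swallow trailing non-word characters.
TRAILING_OK = /(?:\w[^\w]*){10,}/

SAMPLE = "ab cd-ef gh.ij!!"  # ten word characters, then "!!"

# String#[] with a regex returns the matched substring (or nil).
p SAMPLE[WITH_FINAL_WORD]  # => "ab cd-ef gh.ij"
p SAMPLE[TRAILING_OK]      # => "ab cd-ef gh.ij!!"

# With only nine word characters, neither variant matches.
p "ab cd-ef gh.i!!".match?(WITH_FINAL_WORD)  # => false
p "ab cd-ef gh.i!!".match?(TRAILING_OK)      # => false
```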

Match anywhere in the string:

/\w(?:\W*\w){9,19}/
/(?:\W*\w){10,20}/
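As a rough illustration of how these two unanchored patterns behave, the sketch below compares what each one returns on the same input and shows that the upper bound caps a single match at 20 word characters. The constant names and inputs are invented for the example.

```ruby
BOUNDED      = /\w(?:\W*\w){9,19}/   # starts on a word character
BOUNDED_LEAD = /(?:\W*\w){10,20}/    # may start on non-word characters

TEXT = "--abc-def-ghi-j--"           # ten word characters

p TEXT[BOUNDED]       # => "abc-def-ghi-j"
p TEXT[BOUNDED_LEAD]  # => "--abc-def-ghi-j"

# Without anchors, a longer string still matches; the match just stops
# after 20 word characters.
LONG = "a" * 25
p LONG[BOUNDED].length  # => 20
```

Because neither pattern is anchored, the upper bound limits the length of the match, not the length of the string being searched.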

Validate that a string contains between 10 and 20 word characters in total (non-word characters may appear anywhere):

/\A(?:\W*\w){10,20}\W*\z/
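A minimal sketch of using the anchored pattern as a validator. The helper name valid_word_count? is hypothetical, not part of any library.

```ruby
VALID = /\A(?:\W*\w){10,20}\W*\z/

# Hypothetical helper: true when the whole string holds 10..20 word characters.
def valid_word_count?(text)
  text.match?(VALID)
end

p valid_word_count?("abc-def-ghi-j!")  # => true  (10 word characters)
p valid_word_count?("abc def")         # => false (6 word characters)
p valid_word_count?("a" * 20)          # => true
p valid_word_count?("a" * 21)          # => false (over the upper bound)
```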

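Prefer non-capturing groups, particularly when extracting matches: String#scan, for example, returns the group captures rather than the whole match when the pattern contains capturing groups.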

Watch out for ^ and $: in Ruby's regex they mark the start and end of each line, not of the whole string. Use \A and \z to anchor to the string itself.
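To make the anchor pitfall concrete, here is a sketch with a two-line input whose word-character total is far above 20. The constant names and the input are assumptions for the example.

```ruby
LINE_ANCHORED   = /^(?:\W*\w){10,20}\W*$/
STRING_ANCHORED = /\A(?:\W*\w){10,20}\W*\z/

# 25 word characters on the first line, 10 on the second: 35 in total.
INPUT = ("a" * 25) + "\n" + "abcdefghij"

# ^ and $ let the second line satisfy the pattern on its own.
p INPUT.match?(LINE_ANCHORED)    # => true

# \A and \z force the pattern to account for the entire string.
p INPUT.match?(STRING_ANCHORED)  # => false
```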

EXPLANATION

--------------------------------------------------------------------------------
  \A                       the beginning of the string
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (between 10 and
                           20 times (matching the most amount
                           possible)):
--------------------------------------------------------------------------------
    \W*                      non-word characters (all but a-z, A-Z, 0-
                             9, _) (0 or more times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
    \w                       a word character (a-z, A-Z, 0-9, _)
--------------------------------------------------------------------------------
  ){10,20}                 end of grouping
--------------------------------------------------------------------------------
  \W*                      non-word characters (all but a-z, A-Z, 0-
                           9, _) (0 or more times (matching the most
                           amount possible))
--------------------------------------------------------------------------------
  \z                       the end of the string
