How to extract substring between two characters/substrings

I have a string:

string1 = "my name is fname.lname and i live in xyz. my lname is not common"

I want to extract the substring of string1 that lies between the first space " " and ".lname". In the case above, the answer should be "fname.lname".

string1[/(?<= ).*?(?=\.lname\b)/]
  #=> "name is fname" 

(?<= ) is a positive lookbehind that requires the first character matched be immediately preceded by a space, but that space is not part of the match.
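To see what the lookbehind changes, here is a small sketch comparing it with a pattern that matches the space directly. The variable names `consumed` and `asserted` are only illustrative.

```ruby
string1 = "my name is fname.lname and i live in xyz. my lname is not common"

# A literal space in the pattern is consumed, so it appears in the result.
consumed = string1[/ .*?(?=\.lname\b)/]

# The lookbehind only asserts that a space precedes the match.
asserted = string1[/(?<= ).*?(?=\.lname\b)/]

p consumed  # " name is fname"
p asserted  # "name is fname"
```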

(?=\.lname\b) is a positive lookahead that requires the last character matched to be immediately followed by the string ".lname" (see note 1), which is itself followed by a word boundary ( \b ), but that string is not part of the match. That ensures, for example, that ".lnamespace" is not matched. If that should be matched, remove \b .
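A short sketch of the effect of \b, using a made-up sample string that contains ".lnamespace" before ".lname" (the string and variable names are assumptions for illustration):

```ruby
sample = "my name is fname.lnamespace and fname.lname"

# With \b, ".lnamespace" does not satisfy the lookahead, so the match
# keeps growing until the later ".lname".
bounded = sample[/(?<= ).*?(?=\.lname\b)/]

# Without \b, the lookahead is satisfied inside ".lnamespace".
unbounded = sample[/(?<= ).*?(?=\.lname)/]

p bounded    # "name is fname.lnamespace and fname"
p unbounded  # "name is fname"
```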

.*? matches zero or more characters ( .* ), non-greedily ( ? ). (Matches are greedy by default.) The non-greedy qualifier has the following effect:

"my name is fname.lname and fname.lname"[/(?<= ).*(?=\.lname\b)/]
  #=> "name is fname.lname and fname" 
"my name is fname.lname and fname.lname"[/(?<= ).*?(?=\.lname\b)/]
  #=> "name is fname" 

In other words, the non-greedy match stops at the first occurrence of ".lname" in the string, whereas the greedy match extends to the last.

This could alternatively be written with a capture group and no lookarounds:

string1[/ (.*?)\.lname\b/, 1]
  #=> "name is fname"

This regular expression reads, "match a space followed by zero or more characters, saved in capture group 1, followed by the string ".lname" followed by a word boundary". This uses the form of String#[] that takes two arguments, a regular expression and a reference to a capture group.
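As a sketch of how the capture-group reference can be given, the second argument to String#[] may be a group index or a group name, and String#match gives the same result through a MatchData object. The group name `middle` is an arbitrary choice for this example.

```ruby
string1 = "my name is fname.lname and i live in xyz. my lname is not common"

# Numbered capture group, referenced by index.
by_index = string1[/ (.*?)\.lname\b/, 1]

# Named capture group, referenced by name.
by_name = string1[/ (?<middle>.*?)\.lname\b/, "middle"]

# The same extraction via String#match; m is nil when nothing matches.
m = string1.match(/ (.*?)\.lname\b/)
via_match = m && m[1]

p by_index   # "name is fname"
p by_name    # "name is fname"
p via_match  # "name is fname"
```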

Yet another way follows.

string1[(string1 =~ / /)+1..(string1 =~ /\.lname\b/)-1]
  #=> "name is fname"
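One caveat with this index-based form: String#=~ returns nil when the pattern is not found, so the arithmetic raises NoMethodError. A minimal guarded sketch follows; the helper name `between` is hypothetical, and like the one-liner it assumes the start delimiter is a single character.

```ruby
# Hypothetical helper: returns the text between the first match of
# start_re (assumed to be one character wide) and the first match of
# stop_re, or nil if either is missing or they are out of order.
def between(str, start_re, stop_re)
  start = str =~ start_re
  stop  = str =~ stop_re
  return nil if start.nil? || stop.nil? || stop <= start

  str[start + 1...stop]
end

string1 = "my name is fname.lname and i live in xyz. my lname is not common"

found   = between(string1, / /, /\.lname\b/)
missing = between("nospacehere", / /, /\.lname\b/)

p found    # "name is fname"
p missing  # nil
```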

1 The period in ".lname" must be escaped because an unescaped period in a regular expression (except in a character class) matches any character.
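The difference can be seen with a made-up string in which some other character sits where the period would be (the sample string and variable names are assumptions for illustration):

```ruby
sample = "my name is fnameXlname and fname.lname"

# Unescaped: "." matches any character, so "Xlname" satisfies the lookahead.
unescaped = sample[/(?<= ).*?(?=.lname\b)/]

# Escaped: only a literal period followed by "lname" satisfies it.
escaped = sample[/(?<= ).*?(?=\.lname\b)/]

p unescaped  # "name is fname"
p escaped    # "name is fnameXlname and fname"
```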
