
Regex to match a word after the last dot in a string

I need to extract MyActivity from the following text:

These are the steps that the user did before sending the bug, all the user touches and interactions are recorded here. \n\n 2014-07-15 13:46:02.323+0200 UTC      com.bug.demo.demoapplication.MyActivity was started \n 2014-07-15 13:46:27.026+0200 UTC      com.bug.demo.demoapplication.LoginActivity was started \n 2014-07-15 13:46:35.108+0200 UTC      In activity com.bug.demo.ss.ss.MyActivity.ss: View(email_field) of type android.widget.EditText received a click event \n 2014-07-15 13:46:36.692+0200 UTC      In activity com.bug.demo.demoapplication.MyActivity: View(password_field) of type android.widget.EditText received a click event \n 2014-07-15 13:47:02.922+0200 UTC      In activity com.bug.demo.demoapplication.MyActivity: View(login_begin) of type android.widget.Button received a click event \n 2014-07-15 13:47:25.013+0200 UTC    

I need a regex that matches the last part of any com.bug name in that string, which here is MyActivity.

Here is what I tried so far:

(\.)\S*[^\W]

This matches the whole com.bug.demo.demoapplication...

How can I refine it to match only MyActivity, regardless of the number of dots before it?

\bcom\.bug(?:\.\S+)?\.(\w+)

You can use this and grab group 1. See the demo.

https://regex101.com/r/vV1wW6/18
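
For illustration, here is a minimal Ruby sketch of this approach. The `LOG` sample (shortened from the question) and the helper name `last_segments` are assumptions made for the example, not part of the original answer.

```ruby
# Sample log lines shortened from the question.
LOG = <<~TEXT
  2014-07-15 13:46:02.323+0200 UTC      com.bug.demo.demoapplication.MyActivity was started
  2014-07-15 13:46:27.026+0200 UTC      com.bug.demo.demoapplication.LoginActivity was started
  2014-07-15 13:46:36.692+0200 UTC      In activity com.bug.demo.demoapplication.MyActivity: View(password_field) of type android.widget.EditText received a click event
TEXT

PATTERN = /\bcom\.bug(?:\.\S+)?\.(\w+)/

# scan returns one array of capture groups per match;
# flatten turns [["MyActivity"], ...] into ["MyActivity", ...].
def last_segments(text)
  text.scan(PATTERN).flatten
end

p last_segments(LOG)  # => ["MyActivity", "LoginActivity", "MyActivity"]
```

Note that a name with a trailing segment, such as com.bug.demo.ss.ss.MyActivity.ss, yields ss, because the pattern captures whatever follows the last dot of each com.bug name.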

This works on that string: /([a-z]+\.)+(\w+)/m

The string you need will be in group 2.
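
A short Ruby sketch of how group 2 could be read, assuming the character class in the pattern is meant to be `[a-z]`. The helper names are hypothetical. It also shows that the pattern is not limited to com.bug names: any lowercase dotted name matches.

```ruby
# Pattern from the answer above, reading the character class as [a-z].
DOTTED_NAME = /([a-z]+\.)+(\w+)/m

# Hypothetical helper: group 2 of the first match, or nil when nothing matches.
def first_last_segment(text)
  text[DOTTED_NAME, 2]
end

# Hypothetical helper: group 2 of every match in the text.
def all_last_segments(text)
  text.scan(DOTTED_NAME).map(&:last)
end

line = "In activity com.bug.demo.demoapplication.MyActivity: View(password_field) " \
       "of type android.widget.EditText received a click event"

p first_last_segment(line)  # => "MyActivity"
p all_last_segments(line)   # => ["MyActivity", "EditText"]
```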

Assuming that you already have hold of "com.bug.demo.demoapplication.MyActivity", you can then use something like this: /\.([a-z]|[A-Z]|[0-9])*$/. You can make it more robust by including special characters if you expect them in the string.

This will result in ".MyActivity". You can then get rid of the leading dot character; see the question "What is the easiest way to remove the first character from a string?"
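
As a rough Ruby sketch of these two steps: the helper name `strip_to_last_segment` is hypothetical, the alternation is written as the equivalent class `[A-Za-z0-9]`, and `\z` is used in place of `$` because in Ruby `$` matches at the end of every line rather than only at the end of the string.

```ruby
# Hypothetical helper: returns the text after the last dot, or nil when the
# name does not end in a dot followed only by ASCII letters and digits.
def strip_to_last_segment(qualified_name)
  dotted = qualified_name[/\.[A-Za-z0-9]*\z/]  # e.g. ".MyActivity"
  return nil if dotted.nil?

  dotted[1..-1]  # drop the leading dot
end

p strip_to_last_segment("com.bug.demo.demoapplication.MyActivity")  # => "MyActivity"
p strip_to_last_segment("com.bug.my_activity")                      # => nil
```

The second call returns nil because the underscore is not in the character class, which is the kind of special character the answer suggests adding when needed.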

To match the last dot in the string, you can use the /\.(?=[^.]*\z)/ pattern in Ruby.

To match any word, you can use either \w+ (one or more letters, digits or underscores), or \p{L}+ (any one or more Unicode letters), or (?:\p{L}\p{M}*+)+ (any one or more Unicode letters optionally followed by diacritic marks), or \S+ (one or more non-whitespace chars).

Thus, matching the word after last dot in the string can be done with

\.(\w+)(?=[^.]*\z)
\.(\p{L}+)(?=[^.]*\z)
\.((?:\p{L}\p{M}*+)+)(?=[^.]*\z)
\.(\S+)(?=[^.]*\z)

See demo #1, demo #2, demo #3 and a differently matching demo #4.
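
To make the differences between these variants concrete, here is a small Ruby sketch. The helper names and the `Screen2` input are assumptions for illustration only.

```ruby
# Hypothetical helpers, one per pattern variant; each returns capture group 1.
def word_after_last_dot(text)
  text[/\.(\w+)(?=[^.]*\z)/, 1]
end

def letters_after_last_dot(text)
  text[/\.(\p{L}+)(?=[^.]*\z)/, 1]
end

def nonspace_after_dot(text)
  text[/\.(\S+)(?=[^.]*\z)/, 1]
end

name = "com.bug.demo.demoapplication.MyActivity was started"
p word_after_last_dot(name)     # => "MyActivity"
p letters_after_last_dot(name)  # => "MyActivity"
p nonspace_after_dot(name)      # => "bug.demo.demoapplication.MyActivity"

# \w accepts digits, \p{L} does not.
p word_after_last_dot("com.bug.Screen2")     # => "Screen2"
p letters_after_last_dot("com.bug.Screen2")  # => "Screen"
```

The \S+ variant behaves differently because \S also matches dots, so the match can start at the first dot and run through the rest of the name.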

If you do not want to deal with capture groups, remove the capturing parentheses and add \K after \. so that all text matched so far is dropped from the returned match:

\.\K\w+(?=[^.]*\z)

See this regex demo.
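
A brief Ruby sketch comparing the capture-group form with the \K form; both helper names are hypothetical.

```ruby
# Capture-group form: the match includes the dot, so group 1 is read explicitly.
def last_word_via_group(text)
  text[/\.(\w+)(?=[^.]*\z)/, 1]
end

# \K form: the dot is dropped from the reported match, so no group is needed.
def last_word_via_k(text)
  text[/\.\K\w+(?=[^.]*\z)/]
end

qualified = "com.bug.demo.demoapplication.MyActivity"
p last_word_via_group(qualified)  # => "MyActivity"
p last_word_via_k(qualified)      # => "MyActivity"
```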

Now, to solve the current problem, you can just use

text[/com\.bug(?!.*com\.bug)\S*\.\K\w+/m]

See the regex demo and the Ruby demo. Details:

  • com\.bug - a com.bug string
  • (?!.*com\.bug) - no more com.bug substrings in the remaining part of the string to the right
  • \S* - any zero or more non-whitespaces
  • \. - a dot
  • \K - omit the text matched so far
  • \w+ - one or more letters, digits or underscores.

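In Ruby, the m flag makes . match line break characters too, so the (?!.*com\.bug) lookahead checks the whole rest of the text rather than only the current line.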
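
The effect of the lookahead and of the m flag can be seen in a small Ruby sketch. The two-line `sample` is shortened from the question, so the last com.bug name in it is LoginActivity. The constant and helper names are assumptions for the example.

```ruby
LAST_COM_BUG   = /com\.bug(?!.*com\.bug)\S*\.\K\w+/m
SAME_WITHOUT_M = /com\.bug(?!.*com\.bug)\S*\.\K\w+/

# Hypothetical helper: last segment of the last com.bug name in the text.
def last_com_bug_segment(text)
  text[LAST_COM_BUG]
end

sample = <<~LOG
  2014-07-15 13:46:02.323+0200 UTC      com.bug.demo.demoapplication.MyActivity was started
  2014-07-15 13:46:27.026+0200 UTC      com.bug.demo.demoapplication.LoginActivity was started
LOG

p last_com_bug_segment(sample)  # => "LoginActivity"

# Without m, the lookahead stops at the line break, so the first line already
# counts as having no later com.bug and the first name is returned instead.
p sample[SAME_WITHOUT_M]        # => "MyActivity"
```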
