
How to parse this string in Python using regex?

I have the following string in Python:

text = "vagrant  11450  4344  0 Feb22 pts/2    00:00:28 python run.py abc"

I want to capture the text after the time field, that is, "python run.py abc".

I am using the following regex, but it is not working:

 [\d:]+ (.)*

You may use

\d+:\d+\s+(.*)

See the regex demo.

Details

  • \d+ - 1 or more digits
  • : - a colon
  • \d+ - 1 or more digits
  • \s+ - 1 or more whitespace chars
  • (.*) - Group 1 (the value you need, accessed with .group(1)): zero or more chars other than line break chars, as many as possible (all the rest of the line).

See the Python demo:

import re
text = "vagrant  11450  4344  0 Feb22 pts/2    00:00:28 python run.py abc"
m = re.search(r'\d+:\d+\s+(.*)', text)
if m:
    print(m.group(1)) # => python run.py abc
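
For comparison, here is a minimal sketch of why the pattern from the question does not return the command. It assumes the same sample line; the variable names first_match and last_char are only illustrative. Two things go wrong: [\d:]+ also matches plain numbers such as the PID, so the match starts too early, and (.)* is a repeated one-character group, which keeps only its last iteration.

```python
import re

text = "vagrant  11450  4344  0 Feb22 pts/2    00:00:28 python run.py abc"

# Pattern from the question, unchanged.
m = re.search(r'[\d:]+ (.)*', text)

# The overall match starts at the first run of digits (the PID column),
# not at the time field.
first_match = m.group(0)

# A repeated group keeps only the text from its final repetition,
# so group 1 holds a single character.
last_char = m.group(1)

print(first_match)
print(last_char)
```

Moving the quantifier inside the group, as in (.*), and anchoring on the colon-separated time is what makes the suggested pattern work.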

With the re.search() function and a fixed-width lookbehind:

import re

text = "vagrant  11450  4344  0 Feb22 pts/2    00:00:28 python run.py abc"
result = re.search(r'(?<=(\d{2}:){2}\d{2} ).*', text).group()

print(result)

The output:

python run.py abc
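
Note that calling .group() directly on the result of re.search() raises AttributeError when the line has no time field, because re.search() returns None in that case. A hedged sketch of a guarded variant follows; the helper name command_after_time is hypothetical, and the pattern still assumes exactly one space after an HH:MM:SS field.

```python
import re

def command_after_time(line):
    """Return the text after the first HH:MM:SS field, or None if absent.

    Hypothetical helper for illustration; assumes exactly one space
    separates the time field from the command.
    """
    # The lookbehind is fixed-width (8 time characters plus one space),
    # which is what Python's re module requires.
    m = re.search(r'(?<=\d{2}:\d{2}:\d{2} ).*', line)
    return m.group() if m else None

text = "vagrant  11450  4344  0 Feb22 pts/2    00:00:28 python run.py abc"
print(command_after_time(text))
print(command_after_time("no time field here"))
```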

Without RE:

text = "vagrant  11450  4344  0 Feb22 pts/2    00:00:28 python run.py abc"
# The last colon-separated piece is "28 python run.py abc"; [3:] drops "28 ".
text = text.split(":")[-1][3:]
print(text)

Output:

python run.py abc
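
Splitting on ":" breaks as soon as the command itself contains a colon. A possible alternative, sketched below, splits on whitespace with a split limit instead. It rests on an assumption that is not stated in the question: that every line has exactly seven whitespace-separated columns before the command, as in the sample. The second input line is a made-up variant used only to show the difference.

```python
text = "vagrant  11450  4344  0 Feb22 pts/2    00:00:28 python run.py abc"

# Assumption: seven whitespace-separated columns precede the command.
# Limiting the split to 7 leaves the whole command as the 8th element.
command = text.split(None, 7)[7]
print(command)

# Made-up variant whose command contains a colon.
line_with_colon = "vagrant  11450  4344  0 Feb22 pts/2    00:00:28 python run.py a:b"

# The colon-based approach keeps only what follows the last colon.
colon_result = line_with_colon.split(":")[-1][3:]

# The column-based approach keeps the full command.
column_result = line_with_colon.split(None, 7)[7]

print(repr(colon_result))
print(column_result)
```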

You can use re.split with the regex :\d{2}:\d{2}\s+

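import re

text = 'vagrant  11450  4344  0 Feb22 pts/2    00:00:28 python run.py abc'
# Avoid naming the variable "str", which would shadow the built-in type.
result = re.split(r':\d{2}:\d{2}\s+', text)[1]
print(result)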

Output: python run.py abc
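
One caveat with re.split: without a limit it splits at every occurrence of the separator, so a command that itself contains a time-like value gets cut short. The sketch below passes maxsplit=1 to split only once and returns None when the separator is missing. The helper name split_command and the sample line with "--at 12:00:10" are hypothetical, chosen only to illustrate the case.

```python
import re

def split_command(line):
    """Return the text after the first ':MM:SS ' separator, or None.

    Hypothetical helper for illustration.
    """
    # maxsplit=1 stops after the first separator, so later time-like
    # values inside the command are left untouched.
    parts = re.split(r':\d{2}:\d{2}\s+', line, maxsplit=1)
    return parts[1] if len(parts) == 2 else None

# Made-up line whose command contains another time-like value.
line = "vagrant  11450  4344  0 Feb22 pts/2    00:00:28 python run.py --at 12:00:10 now"

unlimited = re.split(r':\d{2}:\d{2}\s+', line)[1]
limited = split_command(line)

print(unlimited)
print(limited)
```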

