
Some help with a regular expression in Python

I am very new to both Python and regular expressions, so bear with me. I have some text that looks like this:

Change 421387 on 2011/09/20 by person@domain.com

    Some random text
    including line breaks

Change 421388 on 2011/09/20 by person2@domain.com

    Some other random text
    including line breaks

Now I want to use Python with a regular expression to split this text into a tuple. In the end, the tuple should contain two elements.

Element 0:

Change 421387 on 2011/09/20 by person@domain.com

    Some random text
    including line breaks

Element 1:

Change 421388 on 2011/09/20 by person2@domain.com

    Some other random text
    including line breaks

I realize that I can use regex to recognize the pattern formed by:

  • the word "Change"
  • a space
  • some digits
  • some text
  • a date in the form ####/##/##
  • some text
  • @
  • some text
  • line break

I know it could be broken down further, but I think recognizing these things is good enough for my purposes.

Once I come up with a regular expression for that pattern, how can I use it to split the string into a tuple of strings?

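Use re.split with a lookahead assertion. The lookahead checks that a "Change ..." header follows without consuming it, so only the whitespace in front of each header is split away and the header text stays in the result.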

>>> import re
>>> re.split(r'(?=\s+Change \d+ on \d{4})\s+', '''    Change 421387 on 2011/09/20 by person@domain.com
...     Some random text including line breaks
...     Change 421388 on 2011/09/20 by person2@domain.com
...     Some other random text including line breaks''')
['', 'Change 421387 on 2011/09/20 by person@domain.com\n    Some random text including line breaks', 'Change 421388 on 2011/09/20 by person2@domain.com\n    Some other random text including line breaks']
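
Note that re.split returns a list, and here the list starts with an empty string because the input begins with whitespace. If you need exactly the two-element tuple from the question, something like the following may work. It is only a sketch: the names HEADER and split_changes are made up for illustration, the sample text is a guess at the original layout, and the pattern assumes every header has the form "Change <digits> on ####/##/## by <something>@<something>".

```python
import re

# Sample input, assumed to look like the text in the question.
text = """Change 421387 on 2011/09/20 by person@domain.com

    Some random text
    including line breaks

Change 421388 on 2011/09/20 by person2@domain.com

    Some other random text
    including line breaks
"""

# Hypothetical pattern name. re.VERBOSE lets the pattern carry comments;
# literal spaces must then be escaped as "\ ".
HEADER = re.compile(r"""
    \s+                            # whitespace before a header (removed by the split)
    (?=                            # lookahead: the header itself is kept
        Change\ \d+                # the word Change, a space, some digits
        \ on\ \d{4}/\d{2}/\d{2}    # a date in the form ####/##/##
        \ by\ \S+@\S+              # an address containing an @ sign
    )
""", re.VERBOSE)


def split_changes(s):
    """Split s in front of every change header and return a tuple of strings."""
    # strip() removes leading whitespace, so no empty first piece is produced.
    parts = HEADER.split(s.strip())
    # Drop empty pieces, e.g. when the input itself is empty.
    return tuple(p for p in parts if p)


changes = split_changes(text)
print(len(changes))                # 2
print(changes[1].splitlines()[0])  # Change 421388 on 2011/09/20 by person2@domain.com
```

Each element starts at its "Change" line and keeps the indented description, including its internal line breaks.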
