Python Get text between keywords

I would like to get the text between two keywords, [en] and [ja].

So for the following example:

[en]
Text
- Example
- Example
- Example

[ja]
Text
 - 例
 - 例
 - 例

I need it to return only:

Text
- Example
- Example
- Example

I have tried using regex:

([en])(.|\n)+?([ja])

But it only grabs the first two characters of the first line. What am I doing wrong here?

You may use this regex to capture the text between [en] and [ja]:

\[en]\n((?:.*\n)*?)\n\[ja]


RegEx Details:

  • \[en]\n : Match [en] followed by a line break
  • ((?:.*\n)*?) : Match a whole line, including its line break. Repeat this group 0+ times (lazy matching) and capture the matched text in group #1
  • \n\[ja] : Match line break followed by [ja]
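As a minimal sketch of how this pattern might be used from Python: the variable names and the helper extract_en below are illustrative only, not part of the original answer. Note that the brackets are escaped here because an unescaped [en] is a character class that matches a single "e" or "n", not the literal marker, which is the problem with the pattern in the question.

```python
import re

# Sample input taken from the question.
text = """[en]
Text
- Example
- Example
- Example

[ja]
Text
 - 例
 - 例
 - 例
"""

# \[en] matches the literal marker; group 1 lazily collects whole lines
# until a blank line followed by [ja] is reached.
pattern = re.compile(r"\[en]\n((?:.*\n)*?)\n\[ja]")


def extract_en(source):
    """Return the text between [en] and [ja], or None if no match (hypothetical helper)."""
    match = pattern.search(source)
    return match.group(1) if match else None


english = extract_en(text)
print(english)
```

The captured text keeps the line break of its last line; calling english.rstrip("\n") removes it if that is not wanted.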

This pattern uses lookarounds, so the whole match (rather than a capture group) is the text between [en] and [ja]:

(?<=\[en\]\n)(?:(?:.*\n)+?)(?=\n\[ja\])
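A short sketch of the lookaround variant in Python, assuming a shortened sample string of my own. Python's re module only accepts fixed-width lookbehinds, and \[en\]\n qualifies because it always matches exactly five characters.

```python
import re

# The lookbehind and lookahead assert the markers without consuming them,
# so group(0) is only the text in between.
lookaround = re.compile(r"(?<=\[en\]\n)(?:(?:.*\n)+?)(?=\n\[ja\])")

# Shortened sample (an assumption for illustration, not the question's full input).
sample = "[en]\nText\n- Example\n\n[ja]\nText\n - 例\n"

found = lookaround.search(sample)
between = found.group(0) if found else None
print(between)
```

Because the repeated group uses +? rather than *?, this variant needs at least one line of content between the two markers in order to match.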

