Python: get text between keywords

Question

I would like to get the text between two keywords, [en] and [ja].

So for the following example:

[en]
Text
- Example
- Example
- Example

[ja]
Text
 - 例
 - 例
 - 例

I need it to return only:

Text
- Example
- Example
- Example

I have tried using regex:

([en])(.|\n)+?([ja])

But it only grabs the first 2 characters of the first line. What am I doing wrong here?

Answer 1

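An unescaped [en] is a character class that matches a single character (e or n), not the literal text [en], so the opening bracket has to be escaped. You may use this regex to capture the text between [en] and [ja]: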

\[en]\n((?:.*\n)*?)\n\[ja]

RegEx Details:

\[en]\n : Match [en] followed by a line break
((?:.*\n)*?) : Match any line, including its trailing line break. Repeat this group 0 or more times (lazy matching) and capture the matched text in group #1
\n\[ja] : Match line break followed by [ja]
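
For reference, here is a minimal sketch of how this could be used from Python with the standard re module. The variable names (sample, broken, fixed, english) are just placeholders for illustration, and the sample string is the input from the question.

```python
import re

# Sample input reproduced from the question.
sample = (
    "[en]\n"
    "Text\n"
    "- Example\n"
    "- Example\n"
    "- Example\n"
    "\n"
    "[ja]\n"
    "Text\n"
    " - 例\n"
    " - 例\n"
    " - 例\n"
)

# The original attempt: unescaped [en] and [ja] are character classes,
# so group 1 is a single character rather than the literal tag.
broken = re.search(r"([en])(.|\n)+?([ja])", sample)
print(repr(broken.group(1)))

# Escaped pattern from this answer; group 1 holds the lines in between.
fixed = re.search(r"\[en]\n((?:.*\n)*?)\n\[ja]", sample)
english = fixed.group(1) if fixed else ""
print(english)
```

Group 1 keeps the trailing line break of the last captured line; call english.rstrip("\n") if you do not want it.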

Answer 2

This pattern uses a lookbehind and a lookahead, so the whole match (rather than a capture group) is the text between [en] and [ja]:

(?<=\[en\]\n)(?:(?:.*\n)+?)(?=\n\[ja\])
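
A short sketch of the lookaround version in Python, assuming the same input as in the question (the names sample, pattern and between are illustrative only). Python's re module only accepts fixed-width lookbehinds, which is fine here because \[en\]\n always matches exactly five characters.

```python
import re

# Sample input reproduced from the question.
sample = (
    "[en]\n"
    "Text\n"
    "- Example\n"
    "- Example\n"
    "- Example\n"
    "\n"
    "[ja]\n"
    "Text\n"
    " - 例\n"
    " - 例\n"
    " - 例\n"
)

# The lookbehind and lookahead assert the tags without consuming them,
# so the full match (group 0) is already the text in between.
pattern = re.compile(r"(?<=\[en\]\n)(?:(?:.*\n)+?)(?=\n\[ja\])")
match = pattern.search(sample)
between = match.group(0) if match else ""
print(between)
```

Because the inner group uses +?, this version needs at least one line between the tags, so it will not match when [en] is followed directly by the blank line and [ja].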