Skipping over a regex pattern contained within the pattern I'm looking for
I'm parsing Pandoc-markdown files that contain footnotes beginning with ^[ and ending with ], some of which contain embedded [] pairs. For example:
...
to explain how the feature came to be as it is, so you can use generics more
effectively.^[Angelika Langer's [Java Generics FAQ](
www.angelikalanger.com/GenericsFAQ/JavaGenericsFAQ.html) as well as her other
writings (together with Klaus Kreft) were invaluable during the preparation of
this chapter.]
...
The naive approach (in Python):
re.compile(r"\^\[.+?\]", flags=re.DOTALL)
stops at the first ], so it doesn't capture the whole footnote. Is there a way to skip past the nested [] clauses?
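To make the failure concrete, here is a minimal sketch that runs the same lazy pattern over a shortened, single-line version of the footnote. The sample string is an abbreviation made up for illustration, not the exact text from the file.

```python
import re

# Shortened, hypothetical sample: one footnote with an embedded [..](..) link.
sample = (
    "effectively.^[Angelika Langer's [Java Generics FAQ]("
    "www.angelikalanger.com/GenericsFAQ/JavaGenericsFAQ.html) was invaluable.] More text."
)

naive = re.compile(r"\^\[.+?\]", flags=re.DOTALL)
match = naive.search(sample)

# The lazy .+? stops at the first ] it meets, which closes the inner
# link text rather than the footnote itself.
print(match.group(0))  # ^[Angelika Langer's [Java Generics FAQ]
```

The match ends right after "FAQ", so the URL and the rest of the footnote are lost.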
You can do this with the PyPI regex module using a subroutine call; you just need to be careful where you set the group boundaries:
import regex
text = r"""...
to explain how the feature came to be as it is, so you can use generics more
effectively.^[Angelika Langer's [Java Generics FAQ](
www.angelikalanger.com/GenericsFAQ/JavaGenericsFAQ.html) as well as her other
writings (together with Klaus Kreft) were invaluable during the preparation of
this chapter.]
..."""
print( [x.group(1) for x in regex.finditer(r'\^(\[(?:[^][]++|(?1))*])', text)] )
Output:
["[Angelika Langer's [Java Generics FAQ](\nwww.angelikalanger.com/GenericsFAQ/JavaGenericsFAQ.html) as well as her other\nwritings (together with Klaus Kreft) were invaluable during the preparation of\nthis chapter.]"]
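Pattern details:
\^ - a ^ character
(\[(?:[^][]++|(?1))*]) - Group 1:
    \[ - a [ character
    (?:[^][]++|(?1))* - zero or more occurrences of:
        [^][]++ - one or more characters other than ] and [
        | - or
        (?1) - the Group 1 pattern, recursed as a subroutine
    ] - a ] character.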
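If installing the third-party regex module is not an option, the same balanced-bracket idea can be approximated with a plain depth counter. This is only a sketch under that assumption, not part of the answer above: the function name find_footnotes is hypothetical, and it ignores escaped brackets and other Pandoc edge cases.

```python
def find_footnotes(text):
    """Return each ^[...] footnote, outer brackets included, honoring nested []."""
    results = []
    pos = 0
    while True:
        start = text.find("^[", pos)
        if start == -1:
            break
        depth = 0
        end = None
        # Scan from the opening [ and track nesting depth until it returns to 0.
        for j in range(start + 1, len(text)):
            ch = text[j]
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
                if depth == 0:
                    end = j
                    break
        if end is None:
            # Unbalanced opener: skip it and keep looking further on.
            pos = start + 2
            continue
        results.append(text[start + 1:end + 1])
        pos = end + 1
    return results


print(find_footnotes("a^[x [y](z) w] b^[q]"))  # ['[x [y](z) w]', '[q]']
```

Like group 1 in the regex version, each result keeps its outer brackets; slice with [1:-1] to drop them.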