
How to write a regex in Python to match this?

The code is as follows:

#coding=utf-8

import re

str = "The output is\n"
str += "1) python\n"
str += "A dynamic language\n"
str += "easy to learn\n"
str += "2) C++\n"
str += "difficult to learn\n"
str += "3244) PHP\n"
str += "eay to learn\n"


pattern = r'^[1-9]+\) .*'
print(re.findall(pattern, str, re.M))

The output is:

['1) python', '2) C++', '3244) PHP']

However, I want to split it like this:

['1) python\nA dynamic language\neasy to learn\n', '2) C++\ndifficult to learn\n', '3244) PHP\neay to learn\n']

That is, ignore the leading lines that do not start with "number)". Once a line starting with "number)" is found, that line and all following lines, up to the next line that starts with "number)", are considered one group. How should I rewrite the pattern?

>>> import re
>>> strs = 'The output is\n1) python\nA dynamic language\neasy to learn\n2) C++\ndifficult to learn\n3244) PHP\neay to learn\n'
>>> re.findall(r'\d+\)\s[^\d]+',strs)
['1) python\nA dynamic language\neasy to learn\n',
'2) C++\ndifficult to learn\n',
'3244) PHP\neay to learn\n']

You can use this, which allows digits in the text as long as they are not followed by a closing parenthesis:

re.findall(r'\d+\)\s(?:\D+|\d+(?!\d*\)))*',str)
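The difference between the two patterns above only shows up when the body text itself contains digits. Here is a minimal sketch on a made-up input (the `text` string and the variable names `simple` and `tolerant` are assumptions for illustration, not part of the question):

```python
import re

# Hypothetical sample: the body of the first group contains a digit ("3").
text = (
    "The output is\n"
    "1) python\n"
    "Version 3 is current\n"
    "2) C++\n"
    "difficult to learn\n"
)

# [^\d]+ stops at the first digit, so the first group is cut short.
simple = re.findall(r'\d+\)\s[^\d]+', text)

# Digits are allowed unless they are directly followed by ")".
tolerant = re.findall(r'\d+\)\s(?:\D+|\d+(?!\d*\)))*', text)

print(simple)    # ['1) python\nVersion ', '2) C++\ndifficult to learn\n']
print(tolerant)  # ['1) python\nVersion 3 is current\n', '2) C++\ndifficult to learn\n']
```

Note that neither pattern is anchored to the start of a line, so a "number)" that appears in the middle of a line would also start a new group.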

You need to add the Python regex token for whitespace to your pattern to account for the newlines.

Try this:

regex = r"[1-9]+\) .*\s.*"

\s is the regex token for any whitespace character, including newlines.
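A pattern of the form `.*\s.*` covers only one line after the numbered line. If a group can span any number of lines, one possible approach is to anchor on line starts with re.M and keep consuming lines while they do not begin with "number) ". The following is only a sketch: it assumes every line, including the last, ends with a newline, and it uses `\d+` rather than `[1-9]+` so that numbers containing 0 (such as "10)") are matched as well. The names `text`, `pattern` and `groups` are illustrative.

```python
import re

# Same input as in the question.
text = (
    "The output is\n"
    "1) python\n"
    "A dynamic language\n"
    "easy to learn\n"
    "2) C++\n"
    "difficult to learn\n"
    "3244) PHP\n"
    "eay to learn\n"
)

# ^\d+\) .*\n           a line that starts with "number) "
# (?:(?!\d+\) ).*\n)*   any following lines that do not start with "number) "
pattern = r'^\d+\) .*\n(?:(?!\d+\) ).*\n)*'

groups = re.findall(pattern, text, re.M)
print(groups)
```

On this input, `groups` contains three strings, one per numbered entry, each including its continuation lines and trailing newline.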
