Python regular expression with re.findall

Question

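I am using findall to split up text.

I started with the expression r'(.*?)(\$.*?\$)' in re.findall, but it gave me no data after the last match was found; I was missing '6\n\n'.

How do I get the last piece of text?

Here is my Python code: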

#!/usr/bin/env python

import re

allData = '''
1
2
3 here Some text in here 
$file1.txt$
4 Some text in here and more  $file2.txt$
5 Some text $file3.txt$ here  
$file3.txt$
6

'''

for record in re.findall(r'(.*?)(\$.*?\$)|(.*?$)', allData, flags=re.DOTALL):
    print(repr(record))

The output I get is:

('\n1\n2\n3 here Some text in here \n', '$file1.txt$', '')
('\n4 Some text in here and more  ', '$file2.txt$', '')
('\n5 Some text ', '$file3.txt$', '')
(' here  \n', '$file3.txt$', '')
('', '', '\n6\n')
('', '', '')
('', '', '')

The output I actually want is:

('\n1\n2\n3 here Some text in here \n', '$file1.txt$')
('\n4 Some text in here and more  ', '$file2.txt$')
('\n5 Some text ', '$file3.txt$')
(' here  \n', '$file3.txt$')
('\n6\n', '', )

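Background info, in case you want to see the bigger picture.

In case you are interested, I am rewriting this in Python. I have the rest of the code under control; I am just getting too much back from findall.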

https://discussions.apple.com/message/21202021#21202021

Answer 1

If I understand correctly from that Apple link, you want to do something like this:

import re


allData = '''
1
2
3 here Some text in here
$file1.txt$
4 Some text in here and more  $file2.txt$
5 Some text $file3.txt$ here
$file3.txt$
6

'''


def read_file(m):
    # m.group(1) is the file name captured between the $ signs.
    with open(m.group(1)) as f:
        return f.read()

# Sloppy matching :D
# print(re.sub(r"\$(.*?)\$", read_file, allData))
# More precise.
print(re.sub(r"\$(file\d+?\.txt)\$", read_file, allData))

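Edit: made the match more precise, as Oscar suggested.

That is, take the file name between the $ signs and read the data from that file, which is what the code above does.

Example output: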

1
2
3 here Some text in here

I'am file1.txt

4 Some text in here and more  
I'am file2.txt

5 Some text 
I'am file3.txt
 here

I'am file3.txt

6

Files:

==> file1.txt <==

I'am file1.txt

==> file2.txt <==

I'am file2.txt

==> file3.txt <==

I'am file3.txt
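
To try the substitution approach from this answer without creating files on disk, here is a minimal sketch that swaps the `open(...).read()` call for a dictionary lookup. The names `fake_files` and `expand_placeholders` are hypothetical and exist only for this illustration; the regular expression is the "more precise" one used above.

```python
import re

# Hypothetical stand-in for the files on disk (illustration only).
fake_files = {
    "file1.txt": "I'am file1.txt\n",
    "file2.txt": "I'am file2.txt\n",
}

def expand_placeholders(text, files):
    """Replace each $fileN.txt$ placeholder with the matching entry in files."""
    def lookup(match):
        # group(1) is the file name without the surrounding $ signs.
        return files[match.group(1)]
    return re.sub(r"\$(file\d+?\.txt)\$", lookup, text)

sample = "before $file1.txt$ middle $file2.txt$ after"
print(repr(expand_placeholders(sample, fake_files)))
```

Because the replacement is a function, re.sub calls it once per match and inserts its return value literally, so backslashes in the file contents are not interpreted as group references.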

Answer 2

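To get the output you want, you need to restrict the pattern to 2 capturing groups. (With 3 capturing groups, every "record" will have 3 elements.)

You can make the second group optional, which should do the job: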

r'([^$]*)(\$.*?\$)?'
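
A small sketch of how this pattern behaves with findall, using a short made-up string (`sample`) rather than the data from the question. One caveat worth assuming you will hit: since both groups can match nothing, findall also reports an empty match at the very end of the string, so the sketch filters that tuple out.

```python
import re

# Two capturing groups: text up to the next $, then an optional $...$ token.
pattern = re.compile(r'([^$]*)(\$.*?\$)?')

# Made-up sample input (illustration only).
sample = "a $f1$ b $f2$ c"

# findall also returns an empty match at the end of the string,
# so drop tuples in which both groups are empty.
records = [rec for rec in pattern.findall(sample) if any(rec)]

for rec in records:
    print(repr(rec))
```

Note that the pattern is used here without re.DOTALL, so a $...$ token cannot span a line break; the `[^$]*` part still crosses newlines because a negated character class matches them.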

Answer 3

Here is one way to solve the substitution problem using findall.

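import re

def readfile(name):
    with open(name) as f:
        return f.read()

# Group 1 captures a file name between two $ signs;
# group 2 captures a lone $ or a run of text containing no $.
r = re.compile(r"\$(.+?)\$|(\$|[^$]+)")

# allData is the string defined in the question.
print("".join(readfile(filename) if filename else text
    for filename, text in r.findall(allData)))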
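
To make the tuples that findall returns here easier to see, the following sketch runs the same pattern on a short made-up string and replaces the file read with a dictionary lookup. The names `fake_files` and `sample` are hypothetical and used only for illustration.

```python
import re

# Hypothetical in-memory stand-in for the files (illustration only).
fake_files = {"f1": "<contents of f1>"}

r = re.compile(r"\$(.+?)\$|(\$|[^$]+)")

sample = "a $f1$ b"

# Each tuple is (filename, text); exactly one of the two is non-empty.
pieces = r.findall(sample)
print(pieces)

# Substitute the file contents where a file name was captured,
# and keep the plain text everywhere else.
result = "".join(fake_files[filename] if filename else text
                 for filename, text in pieces)
print(result)
```

Both alternatives must consume at least one character, so this pattern never produces an empty match and no filtering is needed.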

Answer 4

Here is a partial solution to your problem:

import re

allData = '''
1
2
3 here Some text in here 
$file1.txt$
4 Some text in here and more  $file2.txt$
5 Some text $file3.txt$ here  
$file3.txt$
6

'''

for record in re.findall(r'(.*?)(\$.*?\$)|(.*?$)', allData.strip(), flags=re.DOTALL):
    # Keep only the non-empty groups of each match.
    print([x for x in record if x])

This produces the output:

['1\n2\n3 here Some text in here \n', '$file1.txt$']
['\n4 Some text in here and more  ', '$file2.txt$']
['\n5 Some text ', '$file3.txt$']
[' here  \n', '$file3.txt$']
['\n6']
[]

To avoid the final empty list:

for record in re.findall(r'(.*?)(\$.*?\$)|(.*?$)', allData.strip(), flags=re.DOTALL):
    fields = [x for x in record if x]
    # Skip matches in which every group is empty.
    if fields:
        print(fields)

4 answers

Answer 1
2 2013-02-26 20:50:37

Answer 2
1 2013-02-26 20:56:57

Answer 3
1 2013-02-26 21:46:56

Answer 4
0 2013-02-26 20:56:20
