python - Return text between parentheses

Question

I have a file containing several lines of strings written as:

[(W)40(indo)25(ws )20(XP)111(, )20(with )20(the )20(fragment )20(enlar)18(ged )20(for )20(clarity )20(on )20(Fig. )] TJ

I need only the text inside the parentheses. I tried the following code:

import re

readstream = open ("E:\\New folder\\output5.txt","r").read()

stringExtract = re.findall('\[(.*?)\]', readstream, re.DOTALL)
string = re.compile ('\(.*?\)')
stringExtract2 =  string.findall (str(stringExtract))

But some strings are missing from the output; e.g., for the line above, the word (with ) is not found. The order of the strings also differs from the file: for (enlar) and (ged ) above, the second one, (ged ), appears before (enlar), as in: ( ged other strings ..... enlar). How can I fix these problems?

Answer 1

Without regexp:

[p.split(')')[0] for p in s.split('(') if ')' in p]

Output:

['W', 'indo', 'ws ', 'XP', ', ', 'with ', 'the ', 'fragment ', 'enlar', 'ged ', 'for ', 'clarity ', 'on ', 'Fig. ']
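The one-liner above assumes the line is already held in a variable `s`. A minimal self-contained sketch, using the sample line from the question and a helper name (`extract_parenthesized`) chosen here purely for illustration, might look like this:

```python
# Sample line from the question; `s` is the name the one-liner above expects.
s = ("[(W)40(indo)25(ws )20(XP)111(, )20(with )20(the )20(fragment )"
     "20(enlar)18(ged )20(for )20(clarity )20(on )20(Fig. )] TJ")

def extract_parenthesized(line):
    """Return the text of each (...) group in order, without using regex."""
    # Every chunk after a "(" starts with the wanted text; cut it at the first ")".
    return [chunk.split(")")[0] for chunk in line.split("(") if ")" in chunk]

parts = extract_parenthesized(s)
print(parts)
```

Because the chunks are produced left to right, the fragments keep the order they have in the line. Note that this approach does not handle nested or escaped parentheses.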

Answer 2

Try this:

import re

readstream = open ("E:\\New folder\\output5.txt","r").read()
stringExtract2 = re.findall(r'\(([^()]+)\)', readstream)

Input:

readstream = r'[(W)40(indo)25(ws )20(XP)111(, )20(with )20(the )20(fragment )20(enlar)18(ged )20(for )20(clarity )20(on )20(Fig. )]'

Output:

['W', 'indo', 'ws ', 'XP', ', ', 'with ', 'the ', 'fragment ', 'enlar', 'ged ', 'for ', 'clarity ', 'on ', 'Fig. ']
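Since the question mentions a file with several such lines, the same pattern can be applied per line and the fragments joined back into readable text. The following is only a sketch: the two-line `sample` is made up from the question's line, and `decode_lines` is a hypothetical helper name.

```python
import re

# Same pattern as above: capture text between "(" and ")" with no nested parens.
PAREN_TEXT = re.compile(r"\(([^()]+)\)")

def decode_lines(text):
    """Join the parenthesized fragments of each line into one string per line."""
    return ["".join(PAREN_TEXT.findall(line)) for line in text.splitlines()]

# Hypothetical two-line input, split from the sample line in the question.
sample = (
    "[(W)40(indo)25(ws )20(XP)111(, )20(with )] TJ\n"
    "[(the )20(fragment )20(enlar)18(ged )] TJ"
)
decoded = decode_lines(sample)
print(decoded)
```

Processing one line at a time keeps the fragments of each line together and in file order.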

Answer 3

findall looks like your friend here. Don't you just want:

re.findall(r'\(.*?\)',readstream)

which returns:

['(W)',
 '(indo)',
 '(ws )',
 '(XP)',
 '(, )',
 '(with )',
 '(the )',
 '(fragment )',
 '(enlar)',
 '(ged )',
 '(for )',
 '(clarity )',
 '(on )',
 '(Fig. )']

Edit: as @vikramis showed, to remove the parens, use a capturing group: re.findall(r'\((.*?)\)', readstream). Also note that it is common (though not requested here) to trim trailing whitespace with something like:

re.findall(r'\((.*?) *\)', readstream)
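To make the difference between the three patterns in this answer concrete, here is a small side-by-side sketch on a shortened version of the question's line (the variable names are chosen only for this example):

```python
import re

sample = "[(W)40(indo)25(ws )20(XP)] TJ"

# No capturing group: findall returns the whole match, parentheses included.
with_parens = re.findall(r"\(.*?\)", sample)

# One capturing group: findall returns only the group's contents.
without_parens = re.findall(r"\((.*?)\)", sample)

# Trailing spaces are matched outside the group, so they are dropped.
trimmed = re.findall(r"\((.*?) *\)", sample)

print(with_parens)
print(without_parens)
print(trimmed)
```

When a pattern contains exactly one capturing group, `re.findall` returns the text of that group rather than the full match, which is why adding the inner parentheses is enough to strip the outer ones.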

Answer 4

Your first problem is

stringExtract = re.findall('\[(.*?)\]', readstream, re.DOTALL)

I have no idea why you are doing this, and I'm pretty sure you don't want to.

Try this instead:

import re
readstream = "[(W)40(indo)25(ws )20(XP)111(, )20(with )20(the )20(fragment )20(enlar)18(ged )20(for )20(clarity )20(on )20(Fig. )] TJ"
# Raw string, so the backslashes reach the regex engine unchanged.
stringExtract = re.findall(r'\(([^)]+)\)', readstream, re.DOTALL)

which says: find every run of characters inside parentheses that contains no closing parenthesis.
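The negated character class behaves slightly differently from the lazy `.*?` used in other answers. The sketch below uses a made-up `sample` string (not from the question) containing an empty pair of parentheses and a group that spans a newline:

```python
import re

# Made-up input: a normal group, an empty group, and a group spanning a newline.
sample = "(a)()(b\nc)"

# [^)]+ needs at least one character and happily crosses newlines.
negated = re.findall(r"\(([^)]+)\)", sample)

# .*? accepts empty groups, but "." stops at a newline by default.
lazy = re.findall(r"\((.*?)\)", sample)

# With re.DOTALL, "." also matches newlines.
lazy_dotall = re.findall(r"\((.*?)\)", sample, re.DOTALL)

print(negated)
print(lazy)
print(lazy_dotall)
```

As the sketch suggests, `re.DOTALL` only changes the meaning of `.`, so it has no effect on a pattern built from `[^)]+` alone.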

Answer 1
6 2014-12-02 23:19:27

Answer 2
3 2014-12-02 23:11:52

Answer 3
2 2014-12-02 23:08:59

Answer 4
0 2014-12-02 23:05:16
