Parsing strings in a Python list that start with certain characters and have length N

Question

I tried to emulate the approach from "Find strings of length 10 with regex"

with this:

for char in updated_metabolite:
    found_all = re.findall('^cpd.{5}$', updated_metabolite)

The list updated_metabolites looks like this before running the above code:

cpd00001;cpd00009;cpd00015;cpd00041;cpd00095;cpd00982;cpd02333
cpd00001;cpd00003;cpd00004;cpd00067;cpd00075;cpd00985
cpd00003;cpd00004;cpd00067;cpd15560;cpd15561
cpd00005;cpd00006;cpd00067;cpd14938;cpd17051
cpd00001;cpd00002;cpd00003;cpd00004;cpd00008;cpd00009;cpd00067;cpd00149;cpd03913;cpd03914
cpd00005;cpd00006;cpd11669;cpd17097
cpd00005;cpd00006;cpd00067;cpd00129;cpd02431
cpd00001;cpd00015;cpd00067;cpd00129;cpd00858;cpd00982
cpd00005;cpd00006;cpd00011;cpd00017;cpd00060;cpd00067;cpd00791;cpd02083;cpd03091;

Answer 1

Don't use the start-of-line and end-of-line anchors (^ and $), because the input has multiple matches on the same line:

re.findall(r'cpd\d{5}', updated_metabolite)
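
To see why the anchors matter, here is a minimal sketch that runs both patterns side by side. It assumes updated_metabolite is a single string holding the rows shown in the question (two of them are copied here as sample data); the names anchored and unanchored are only for illustration.

```python
import re

# Sample data: two rows copied from the question, joined into one string.
# Assumption: updated_metabolite is a single string, not a list of rows.
updated_metabolite = (
    "cpd00001;cpd00003;cpd00004;cpd00067;cpd00075;cpd00985\n"
    "cpd00005;cpd00006;cpd11669;cpd17097\n"
)

# Anchored pattern from the question: without re.MULTILINE, ^ and $ refer to
# the start and end of the whole string, so a string that holds many IDs
# cannot match.
anchored = re.findall(r'^cpd.{5}$', updated_metabolite)

# Unanchored pattern from this answer: finds every "cpd" followed by five
# digits, wherever it appears.
unanchored = re.findall(r'cpd\d{5}', updated_metabolite)

print(anchored)
print(unanchored)
```

With this sample, the anchored pattern finds nothing, while the unanchored one returns all ten IDs in order.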

Answer 2

If what you want to do is create a list out of semicolon-separated data, you should use re.split instead.

lst = re.split(';|\n', updated_metabolite)

Output

['cpd00001', 'cpd00009', 'cpd00015', ...]
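
One thing to watch for with the split approach: the last row in the question ends with a semicolon, so re.split also returns empty strings. The sketch below, which again assumes updated_metabolite is one string, splits the data and then keeps only items with a given prefix and length using plain string checks. The names prefix, n, pieces, and ids are hypothetical and not part of the original answer.

```python
import re

# Sample data: two rows copied from the question. The second row ends with
# a trailing semicolon, as in the original data.
updated_metabolite = (
    "cpd00005;cpd00006;cpd11669;cpd17097\n"
    "cpd00005;cpd00006;cpd00011;cpd00017;cpd00060;cpd00067;cpd00791;cpd02083;cpd03091;\n"
)

# Split on semicolons and newlines, as in the answer above.
pieces = re.split(r';|\n', updated_metabolite)

# Hypothetical filter parameters: keep strings that start with "cpd"
# and are exactly 8 characters long ("cpd" plus five more characters).
prefix = "cpd"
n = 8
ids = [p for p in pieces if p.startswith(prefix) and len(p) == n]

print(pieces.count(''))  # number of empty strings produced by the split
print(ids)
```

The length check also drops the empty strings, so no separate cleanup step is needed here.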

2 answers

Answer 1 0 2018-06-26 19:24:06

Answer 2 0 2018-06-26 19:28:59
