Find and extract pattern strings from a list in Python 3

Question

I have a list in Python 3 that looks like this:

list1 = ['1128=9,9=639, 75=20140110,268=6,START,22=8,48=49798,83=63663,271=7,1020=7,5799=1,START,48=49798,83=63664,451=0,1003=2,5799=1','1128=9,9=6389, 75=20140119, START, 22=8,48=49798, 271=0.75,1020=7,5799=1,START,22=8,48=49798,83=63664,451=0,1020=10,5799=1,START,22=8,48=49798,271=63664,451=0,1020=10,5799=1']

The length of list1 is 2.

First, I want to extract all the useful strings and omit all the others.

I would like to keep every field that contains 52=, START, 75=, 271=, or 451=.

The desired output is then:

list2 = ['75=20140110, START,271=7,START,451=0','75=20140119, START, 271=0.75,START,451=0, START, 271=63664,451=0']

As the last step, I would like to split the list and create a new list.

Within each element, I would like to prepend the substring '75=.....' to each substring that begins with the word 'START'.

The desired output looks like this:

list3 = ['75=20140110, START,271=7', '75=20140110,START,451=0','75=20140119, START, 271=0.75','75=20140119,START,451=0', '75=20140119,START, 271=63664,451=0']

Now it is a list of 5 elements, because there are 2 START substrings in element 1 of list2 and 3 START substrings in element 2 of list2.

I am new to Python. Thank you so much for the help.

Answer 1

This should solve your first problem:

(You did not specify whether your use case is sensitive to spaces, so I ignored them.)

list1 = [
    '1128=9,9=639, 75=20140110,268=6,START,22=8,48=49798,83=63663,271=7,1020=7,5799=1,START,48=49798,83=63664,451=0,1003=2,5799=1','1128=9,9=6389, 75=20140119, START, 22=8,48=49798, 271=0.75,1020=7,5799=1,START,22=8,48=49798,83=63664,451=0,1020=10,5799=1,START,22=8,48=49798,271=63664,451=0,1020=10,5799=1'
]

texts_to_keep = ['52=', 'START', '75=', '271=', '451=']

# Split the list on commas to work with the data easier
list1_split = [item.split(',') for item in list1]

# Create a new list of the same length as your old list1
list1_new = [[] for item in list1]
for items, list1_list in zip(list1_split, list1_new):
    # Grab each string in the sub list
    for item in items:
        # Now check if your substrings are in the original string
        for text_to_keep in texts_to_keep:
            # If it is, keep it
            if text_to_keep in item:
                list1_list.append(item)

# Join each filtered sub list back into a single comma-separated string
final_list1 = [
    ','.join(sub_list) for sub_list in list1_new
]
print(final_list1)

Which gives the output:

[' 75=20140110,START,271=7,START,451=0', ' 75=20140119, START, 271=0.75,START,451=0,START,271=63664,451=0']

It should be possible to do this with a list comprehension for performance, but it got very ugly, so I went with the simple implementation above.

As for your second question: as far as I can tell, you sometimes add the substring '75=...' and sometimes do not, and I can't discern the pattern.
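One possible reading of the desired list3, offered as a sketch rather than a definitive answer: every START opens a new group, and each group is prefixed with the 75= field of its element. The helper name split_on_start is hypothetical, and the code assumes that each element carries a 75= field before its first START. Whitespace is stripped, so the spacing differs slightly from the list3 shown in the question.

```python
def split_on_start(filtered):
    """Split each filtered string at START and prefix every group with its 75= field.

    Assumption (not confirmed by the question): the 75= field appears
    before the first START of each element.
    """
    result = []
    for element in filtered:
        fields = [field.strip() for field in element.split(',')]
        header = None  # most recent 75= field seen in this element
        group = None   # fields collected since the last START
        for field in fields:
            if field.startswith('75='):
                header = field
            elif field == 'START':
                # A new START closes the previous group, if there is one
                if group is not None:
                    result.append(','.join(group))
                group = ([header] if header else []) + ['START']
            elif group is not None:
                group.append(field)
        # Flush the last open group of this element
        if group is not None:
            result.append(','.join(group))
    return result


list2 = [
    '75=20140110, START,271=7,START,451=0',
    '75=20140119, START, 271=0.75,START,451=0, START, 271=63664,451=0',
]
list3 = split_on_start(list2)
for row in list3:
    print(row)
```

With the list2 from the question this yields five strings, one per START, which matches the element count the question asks for.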

Answer 2

This should solve your first problem with the help of a list comprehension:

f = ['1128=9,9=639, 75=20140110,268=6,START,22=8,48=49798,83=63663,271=7,1020=7,5799=1,START,48=49798,83=63664,'
     '451=0,1003=2,5799=1',
     '1128=9,9=6389, 75=20140119, START, 22=8,48=49798, 271=0.75,1020=7,5799=1,START,22=8,48=49798,83=63664,'
     '451=0,1020=10,5799=1,START,22=8,48=49798,271=63664,451=0,1020=10,5799=1']

def convert(li):
    text = ['52=', 'START', '75=', '271=', '451=']
    return [", ".join([y for y in x.split(',') for z in text if z in y]) for x in li]

print(convert(f))
# Output: [' 75=20140110, START, 271=7, START, 451=0', ' 75=20140119,  START,  271=0.75, START, 451=0, START, 271=63664, 451=0']
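Both answers test whether a pattern occurs anywhere in a field, which is a substring check: a pattern such as '52=' would also match a field like '1052=3', and a field that matches two patterns would be kept twice. Neither case occurs in the sample data, but a stricter variant can compare the tag before the '=' exactly. The following is a minimal sketch under that assumption; the names KEEP_TAGS, KEEP_WORDS, and filter_fields are hypothetical, and the sample row is a shortened, made-up variant of the question's data.

```python
KEEP_TAGS = {'52', '75', '271', '451'}  # tags to keep (the text before '=')
KEEP_WORDS = {'START'}                   # bare markers to keep


def filter_fields(rows):
    """Keep only fields whose tag, or whole text, is in the keep sets."""
    filtered = []
    for row in rows:
        kept = []
        for field in row.split(','):
            field = field.strip()
            tag = field.partition('=')[0]
            # Exact comparison, so '1052=3' is not mistaken for tag 52
            if field in KEEP_WORDS or ('=' in field and tag in KEEP_TAGS):
                kept.append(field)
        filtered.append(','.join(kept))
    return filtered


# Made-up sample row; '1052=3' would slip through a substring check for '52='
sample = ['1128=9, 75=20140110,1052=3,START,22=8,271=7,5799=1']
print(filter_fields(sample))
```

Because each field is tested once, it can be kept at most once, and stripping the whitespace up front gives consistently formatted output.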

Answer 1
2 2018-07-10 22:05:46

Answer 2
1 accepted 2018-07-11 09:21:08
