
I need to split a Python list into several Python lists, but the new lists need to contain the fields between certain strings

I have a fairly large Python 2.7 list containing strings like this:

biglist = ['A','B1','C00','D','A','1','2000','A','X','3','1','C','D','A','B','C']

I need to cut this up into several separate lists, starting a new list each time an 'A' string is found in the list; each new list then contains everything up to the next 'A'. So the result is this:

list1 = ['A','B1','C00','D']    
list2 = ['A','1','2000']
list3 = ['A','X','3','1','C','D']
list4 = ['A','B','C']
listx = ...

The number of newly created lists also varies.

I'm completely stuck on this and it's over my head; I researched all day and couldn't find anything. Thank you for helping me out. I use Python 2.7.

EDITED: my strings in the biglist are not all 1 char, they differ in size. Thank you for the help.

It can be done fairly simply with a generator:

def split(biglist):
    last = None
    for x in biglist:
        if x == "A":
            if last:
                yield last
            last = [x]
        else:
            if last is None: # in case the list didn't start with 'A'
                last = []
            last.append(x)
    if last: # yield the final group, which has no 'A' after it
        yield last

for x in split(biglist):
    print x

['A', 'B1', 'C00', 'D']
['A', '1', '2000']
['A', 'X', '3', '1', 'C', 'D']
['A', 'B', 'C']

This might not be elegant, but should do the trick:

biglist = ['A','B','C','D','A','1','2','A','X','3','1','C','D','A','B','C']

Make the list into a string first:

bigstring=" ".join(biglist)

Split on "A", then sneakily insert the "A" again:

finallist=["A"+l for l in bigstring.split("A") if l]

Output:

['A B C D ', 'A 1 2 ', 'A X 3 1 C D ', 'A B C']

To access those strings, just do finallist[index], e.g. finallist[0] gives you 'A B C D '. You can also put them all into variables like so:

var1, var2, var3, var4 = finallist

To turn the strings into lists, just do [l.split() for l in finallist]
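One caveat with splitting the joined string is worth illustrating: any item that merely contains the letter 'A' is also cut apart. The sketch below compares the join/split approach with a plain element-by-element loop. The function names and the sample item 'CAT' are made up for the illustration and are not from the question.

```python
def split_via_string(items, marker="A"):
    # Join/split approach: splits on every "A" character,
    # including ones inside longer strings.
    joined = " ".join(items)
    return [(marker + chunk).split() for chunk in joined.split(marker) if chunk]


def split_by_element(items, marker="A"):
    # Element-wise approach: only an item equal to the marker starts a new group.
    groups = []
    for item in items:
        if item == marker or not groups:
            groups.append([])
        groups[-1].append(item)
    return groups


sample = ["A", "CAT", "1", "A", "2"]  # hypothetical data with an embedded "A"
print(split_via_string(sample))   # [['A', 'C'], ['AT', '1'], ['A', '2']]
print(split_by_element(sample))   # [['A', 'CAT', '1'], ['A', '2']]
```

If none of your strings can ever contain an 'A' inside them, the string-based version behaves the same as the loop.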

Aside from the other excellent suggestions, you could write a generator which will give you things you can iterate over later. This could be tidier, but...

def group(stuff):
  item = []
  for thing in stuff:
    if thing != 'A':
      item.append(thing)
      continue
    if len(item) > 0:
      yield item
    item = ['A']
  if item:  # avoid yielding an empty list for empty input
    yield item

if __name__ == '__main__':
  biglist = ['A','B','C','D','A','1','2','A','X','3','1','C','D','A','B','C']
  for i in group(biglist):
    print i

I'd probably use itertools.groupby:

from itertools import groupby

def group_stuff(iterable, partition='A'):
    out = []
    for k, v in groupby(iterable, key=lambda x: x != partition):
        if not k:
            out = list(v)
        else:
            out.extend(v)
            yield out
            out = []
    if out:
        yield out



# Test cases
biglist = ['A','B','C','D','A','1','2','A','X','3','1','C','D','A','B','C']

for item in group_stuff(biglist):
    print(item)

print('*' * 80)
biglist.append('A')
for item in group_stuff(biglist):
    print(item)

print('*' * 80)
biglist.pop(0)
for item in group_stuff(biglist):
    print(item)

Basically, we notice that in your list we have 2 separate groups... The first group is "It's an A!", the second group is "It isn't an A". groupby will partition your iterable into those two groups trivially. All that remains is a little logic to merge the groups appropriately (adding an "It's an A!" group -- if it exists -- to the start of an "It isn't an A" group).

If you have consecutive 'A' in your list, this will give you a list that has more than one 'A' at the beginning. If that's a problem, we can modify the logic in the if not k: block slightly to yield all but the last value as a list...

if not k:
    values = list(v)
    for item in values[:-1]:
        yield [item]
    out = [values[-1]]
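For reference, here is a sketch of how the whole generator might look with that modified block dropped in. The name group_stuff_single and the test input are only illustrative; the logic is otherwise the same as group_stuff.

```python
from itertools import groupby


def group_stuff_single(iterable, partition='A'):
    out = []
    for k, v in groupby(iterable, key=lambda x: x != partition):
        if not k:
            # A run of one or more partition items: every one except the
            # last becomes its own single-item list.
            values = list(v)
            for item in values[:-1]:
                yield [item]
            out = [values[-1]]
        else:
            # A run of non-partition items: attach to the pending 'A', if any.
            out.extend(v)
            yield out
            out = []
    if out:
        # The input ended with a partition item.
        yield out


print(list(group_stuff_single(['A', 'A', 'B', 'A', 'C', 'A'])))
# [['A'], ['A', 'B'], ['A', 'C'], ['A']]
```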

As for setting that output as names in the local namespace, there are LOTS of questions around here which point out that this is generally a bad idea. Here's an external post which talks about it. The gist of it is that you'll do much better if you just use a list to hold the data. Instead of

list0 = ...
list1 = ...

do:

lst[0] = ...
lst[1] = ...

etc. Your code will end up being much easier to work with.
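As a rough illustration of the "one list of lists" idea, the sketch below builds the whole result in a single container by slicing at the positions of 'A'. This slicing approach and the name split_at_marker are not from the answers here, just one possible way to do it. Note that anything before the first 'A' is dropped by this version.

```python
def split_at_marker(items, marker='A'):
    """Return a list of sub-lists, each starting at an occurrence of marker."""
    starts = [i for i, item in enumerate(items) if item == marker]
    ends = starts[1:] + [len(items)]
    return [items[s:e] for s, e in zip(starts, ends)]


biglist = ['A', 'B1', 'C00', 'D', 'A', '1', '2000',
           'A', 'X', '3', '1', 'C', 'D', 'A', 'B', 'C']

lst = split_at_marker(biglist)
print(len(lst))  # 4
print(lst[0])    # ['A', 'B1', 'C00', 'D']
for i, sub in enumerate(lst):
    print('{}: {}'.format(i, sub))
```

Because the number of sub-lists varies, len(lst) and a plain for loop replace any need to know how many numbered variables exist.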

First take your biglist and join it together as a string.

new_list = ''.join(biglist)

Then you have new_list = 'ABCDA12AX31CDABC'.

Split up new_list on 'A':

split_list = new_list.split('A')

Then you have split_list = ['', 'BCD', '12', 'X31CD', 'BC'].

Then add the 'A's back in there:

final_list = ['A'+x for x in split_list if x]

All together:

new_list = ''.join(biglist)
split_list = new_list.split('A')
final_list = ['A'+x for x in split_list if x]

>>> final_list
['ABCD', 'A12', 'AX31CD', 'ABC']

or in neato one-line format:

final_list = ['A'+x for x in ''.join(biglist).split('A') if x]

slap it into a dictionary:

dict_lists = {}
for i,v in enumerate(final_list):
    dict_lists['list{}'.format(i)] = v

and access them like:

>>> dict_lists['list0']
'ABCD'

You could put the results in a dictionary with 'list1', 'list2', ... as keys. The defaultdict creates a new key with an empty list every time an 'A' is encountered in the list. The items following 'A' are added to this list until another 'A' is encountered.

from collections import defaultdict
from itertools import count

biglist = ['A','B','C','D','A','1','2','A','X','3','1','C','D','A','B','C']

c = count(1)
d = defaultdict(list)

for i in biglist:
    if i == 'A':
        j = str(next(c))
    d['list'+ j].append(i)

print(d)
# defaultdict(<class 'list'>, {'list2': ['A', '1', '2'], 'list3': ['A', 'X', '3', '1', 'C', 'D'], 'list1': ['A', 'B', 'C', 'D'], 'list4': ['A', 'B', 'C']})

The first list can be accessed via d['list1'], and in general the n-th list via d['listn'], where n runs from 1 to the number of lists in the dictionary.
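A small variation, sketched under my own assumptions rather than taken from the answer: keying the dictionary by integer avoids building 'listN' strings, and starting the counter at 0 gives any items that come before the first 'A' a home instead of raising a NameError. The function name split_to_dict is made up for this example.

```python
from collections import defaultdict


def split_to_dict(items, marker='A'):
    groups = defaultdict(list)
    n = 0  # items seen before the first marker end up under key 0
    for item in items:
        if item == marker:
            n += 1  # each marker opens the next numbered group
        groups[n].append(item)
    return dict(groups)


biglist = ['A', 'B1', 'C00', 'D', 'A', '1', '2000',
           'A', 'X', '3', '1', 'C', 'D', 'A', 'B', 'C']

d = split_to_dict(biglist)
for key in sorted(d):
    print('{}: {}'.format(key, d[key]))
```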

You can convert the list of characters into a string and use the split() function to divide the string at each occurrence of 'A'.

biglist = ['A','B','C','D','A','1','2','A','X','3','1','C','D','A','B','C']
lists = [list('A'+x) for x in ''.join(biglist).split("A") if x]

This will give you the lists of characters as required.

>>> lists
[['A', 'B', 'C', 'D'], ['A', '1', '2'], ['A', 'X', '3', '1', 'C', 'D'], ['A', 'B', 'C']]
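Keep in mind that this only works while every item is a single character. With the multi-character strings from the edited question, joining without a separator loses the item boundaries, as this quick check suggests (the variable names are just for the illustration):

```python
biglist = ['A', 'B1', 'C00', 'D', 'A', '1', '2000',
           'A', 'X', '3', '1', 'C', 'D', 'A', 'B', 'C']

# Same expression as above, applied to the multi-character data.
char_lists = [list('A' + x) for x in ''.join(biglist).split('A') if x]

print(char_lists[0])  # ['A', 'B', '1', 'C', '0', '0', 'D']
print(char_lists[1])  # ['A', '1', '2', '0', '0', '0']
```

'B1', 'C00' and '2000' have been broken into individual characters, so for that data one of the element-wise approaches (the generator or groupby answers) is the safer choice.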

One option is to use groupby from itertools:

# create a group variable by looping through the list
from itertools import groupby
acc, grp = 0, []
for e in biglist:
    acc += (e == 'A')
    grp.append(acc)

# split the original list by the group variable
[[i[0] for i in g] for _, g in groupby(zip(biglist, grp), lambda x: x[1])]

# [['A', 'B', 'C', 'D'],
#  ['A', '1', '2'],
#  ['A', 'X', '3', '1', 'C', 'D'],
#  ['A', 'B', 'C']]

We can also use pandas:

import pandas as pd
s = pd.Series(biglist)
[list(g) for _, g in s.groupby((s == 'A').cumsum())]

# [['A', 'B', 'C', 'D'],
#  ['A', '1', '2'],
#  ['A', 'X', '3', '1', 'C', 'D'],
#  ['A', 'B', 'C']]

This will create module-level variables list1, ..., listn.

If it is possible to use a list of lists or a dict of lists, you should prefer the other answers.

This answer is based on the Python function globals, which returns a dict of the current global namespace. It modifies this dict to create variables on the fly. There is a similar function for getting local variables (locals), but the documentation carries a note warning that changing that dict is a bad idea. However, there is no such warning for globals so, I hope, the code is safe.

biglist = ['A','B','C','D','A','1','2','A','X','3','1','C','D','A','B','C']

last_arr_index = 1
tmp_list = []
for idx, letter in enumerate(biglist):
    if letter == 'A' and idx > 0:
        # an 'A' closes the current group: publish it as listN
        globals()['list' + str(last_arr_index)] = tmp_list
        last_arr_index += 1
        tmp_list = ['A']
    else:
        tmp_list.append(letter)
# store the final group, which the loop never assigns
globals()['list' + str(last_arr_index)] = tmp_list

print(list1)
print(list2)
print(list3)
