How to filter/clean a list in Python

Question

I have a list that has text and numbers as well as empty values. I'm looking to take:

products = [[], [], [], [], [], [], [], [], [], [], ['productid="6836518"', 'productid="5965878"', 'productid="3851171"'], ['productid="6455623"'], [], ['productid="8024914"', 'productid="2871360"', 'productid="6694729"', 'productid="6760262"'], [], [], ['productid="6466698"', 'productid="5340641"', 'productid="6071996"', 'productid="5379225"'], ['productid="6683916"', 'productid="6690577"', 'productid="7117851"'], ['productid="7094467"'], ['productid="6628351"'], ['productid="5897930"'], ['productid="6812437"', 'productid="5379225"'], ['productid="7918467"', 'productid="7918466"'], []]

And return something like:

products2 =  [6836518, 5965878, 3851171, 6455623, 8024914, 2871360, 6694729, 6760262, 6466698, 5340641, 6071996, 5379225, 6683916, 6690577, 7117851, 7094467, 6628351, 5897930, 6812437, 5379225, 7918467, 7918466]

Answer 1

So examine your data structure. You have a list of lists, where the inner lists contain zero or more elements that look like 'productid="0123456"', and you want to get those digits out.

You should be able to use itertools.chain for this:

import itertools

products2 = []

for el in itertools.chain.from_iterable(products):
    if 'productid' in el:
        _, num = el.split('=')
        num = int(num.strip('"'))
        products2.append(num)

If you might have productid='12345' as well as ..."12345", you can strip both types of quotes with num = int(num.strip('"\'')) instead (note the escaped single quote, which I think looks cleaner than the equivalent """"'""").
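
A minimal sketch of the mixed-quote case described above. The sample list here is made up for illustration and is not the data from the question:

```python
import itertools

# Hypothetical sample mixing both quote styles (not from the question).
sample = [[], ['productid="6836518"', "productid='5965878'"], [], ['productid="3851171"']]

ids = []
for el in itertools.chain.from_iterable(sample):
    if 'productid' in el:
        _, num = el.split('=')
        # Strip double and single quotes from both ends before converting.
        ids.append(int(num.strip('"\'')))

print(ids)  # [6836518, 5965878, 3851171]
```

Note that str.strip treats its argument as a set of characters, so either quote type is removed from either end.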

Answer 2

import re

data = [[], [], [], [], [], [], [], [], [], [], ['productid="6836518"', 'productid="5965878"', 'productid="3851171"'], ['productid="6455623"'], [], ['productid="8024914"', 'productid="2871360"', 'productid="6694729"', 'productid="6760262"'], [], [], ['productid="6466698"', 'productid="5340641"', 'productid="6071996"', 'productid="5379225"'], ['productid="6683916"', 'productid="6690577"', 'productid="7117851"'], ['productid="7094467"'], ['productid="6628351"'], ['productid="5897930"'], ['productid="6812437"', 'productid="5379225"'], ['productid="7918467"', 'productid="7918466"'], []]

clean = []

for l in data:
    for item in l:
        # Take the first run of digits in each element and convert it to int.
        clean.append(int(re.search(r'\d+', item).group(0)))

print(clean)

Answer 3

This single-line solution using re should work:

import re
product = [int(re.search(r"\d+", e).group()) for l in products for e in l]

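Result of product:

[6836518, 5965878, 3851171, 6455623, 8024914, 2871360, 6694729, 6760262, 6466698, 5340641, 6071996, 5379225, 6683916, 6690577, 7117851, 7094467, 6628351, 5897930, 6812437, 5379225, 7918467, 7918466]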
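
One caveat with the one-liner: re.search returns None when an element contains no digits, so calling .group() on it would raise AttributeError. Here is a rough sketch that guards against that, assuming such elements should simply be skipped. The function name extract_ids, the PRODUCT_ID pattern, and the sample input are illustrative, not part of the original answer:

```python
import re

# Illustrative pattern: capture the digits inside either quote style.
PRODUCT_ID = re.compile(r'productid=["\'](\d+)["\']')

def extract_ids(nested):
    """Flatten a list of lists and return the product IDs as ints."""
    ids = []
    for inner in nested:
        for item in inner:
            match = PRODUCT_ID.search(item)
            if match:  # skip elements that do not look like productid="..."
                ids.append(int(match.group(1)))
    return ids

print(extract_ids([[], ['productid="6455623"', 'n/a'], ["productid='7094467'"]]))
# [6455623, 7094467]
```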

Answer 4

You can try this:

With a list comprehension:

tmp = [ j for i in products for j in i]
result = [ int(i.split('=')[1].replace('"','')) for i in tmp]

print(result) # will give the desired output

The same logic expanded into explicit loops:

result= []

for i in products:
  if i:
    for j in i:
      tmp = j.split('=')
      result.append(int(tmp[1].replace('"','')))

print(result)
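
If the intermediate list is not needed, the flattening and the conversion could probably be combined into a single comprehension. This is only a sketch on a shortened, made-up copy of the data, and it uses strip('"') rather than replace, which assumes the quotes only appear at the ends of the value:

```python
# Shortened, hypothetical copy of the question's data.
products = [[], ['productid="6836518"', 'productid="5965878"'], [], ['productid="6455623"']]

# Flatten and convert in one pass; empty inner lists contribute nothing.
result = [int(item.split('=')[1].strip('"')) for inner in products for item in inner]

print(result)  # [6836518, 5965878, 6455623]
```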

4 answers

Answer 1
0 2019-02-14 23:09:03

Answer 2
0 2019-02-14 23:09:05

Answer 3
0 Accepted 2019-02-14 23:13:03

Answer 4
0 2019-02-15 08:54:49
