Filter a numpy ndarray by another array

Question

I am trying to filter my ndarray by another array I have collected (containing some of the same values).

My main ndarray looks like this:

[['Name' 'Col1' 'Count']
 ['test' '' '413']
 ['erd' ' ' '60']
 ..., 
 ['Td1' 'f' '904']
 ['Td2' 'K' '953']
 ['Td3' 'r' '111']]

I have another list with various matching names:

names = ['Td1','test','erd']

What I'd Like to Do

I'd like to use the list names as a filter against the ndarray above.

What I've Tried

name_filter = main_ndarray[:,0] == names

This does not work.

What I'd Expect

[['Name' 'Col1' 'Count']
 ['test' '' '413']
 ['erd' ' ' '60']
 ['Td1' 'f' '904']]

Answer 1

You can use the built-in filter function too.

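import numpy

cats_array = numpy.array(
    [['Name', 'Col1', 'Count'],
     ['test', '', '413'],
     ['erd', ' ', '60'],
     ['Td1', 'f', '904'],
     ['Td2', 'K', '953'],
     ['Td3', 'r', '111']]
)

names = ['Td1', 'test', 'erd']

# keep only the rows whose first column is in names;
# list() is needed on Python 3, where filter returns an iterator
list(filter(lambda x: x[0] in names, cats_array))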

gives (shown as printed under Python 2; Python 3 reports dtype='<U5' instead):

[array(['test', '', '413'],
       dtype='|S5'), array(['erd', ' ', '60'],
       dtype='|S5'), array(['Td1', 'f', '904'],
       dtype='|S5')]
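
For larger arrays, a vectorized boolean mask may be preferable to a Python-level filter. The following is a minimal sketch rather than part of the original answer. It assumes NumPy 1.13 or later (where np.isin is available) and that the header row should be kept, as in the question's expected output. The names mask and filtered are illustrative.

```python
import numpy as np

cats_array = np.array(
    [['Name', 'Col1', 'Count'],
     ['test', '', '413'],
     ['erd', ' ', '60'],
     ['Td1', 'f', '904'],
     ['Td2', 'K', '953'],
     ['Td3', 'r', '111']]
)
names = ['Td1', 'test', 'erd']

# True for every row whose first column appears in names
mask = np.isin(cats_array[:, 0], names)

# Assumption: the header row should survive the filter
mask[0] = True

# Boolean indexing keeps the rows in their original order
filtered = cats_array[mask]
print(filtered)
```

Unlike filter, this returns a single 2-D ndarray instead of a list of 1-D arrays.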

Answer 2

Consider using Pandas for this kind of data:

import pandas as pd

data = [['Name', 'Col1', 'Count'],
        ['test', '', '413'],
        ['erd', ' ', '60'],
        ['Td1', 'f', '904'],
        ['Td2', 'K', '953'],
        ['Td3', 'r', '111']]

df = pd.DataFrame(data[1:], columns=data[0])
names = ['Td1','test','erd']
result = df[df.Name.isin(names)]

Results:

>>> df
   Name Col1 Count
0  test        413
1   erd         60
2   Td1    f   904
3   Td2    K   953
4   Td3    r   111
>>> result
   Name Col1 Count
0  test        413
1   erd         60
2   Td1    f   904
>>>
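
If the data already lives in an ndarray, as in the question, the same approach can probably be applied without going through a list first. This is a rough sketch under some assumptions: the first row of the array is the header, pandas 0.24 or later is installed (for to_numpy), and the variable name filtered is only illustrative.

```python
import numpy as np
import pandas as pd

main_ndarray = np.array(
    [['Name', 'Col1', 'Count'],
     ['test', '', '413'],
     ['erd', ' ', '60'],
     ['Td1', 'f', '904'],
     ['Td2', 'K', '953'],
     ['Td3', 'r', '111']]
)
names = ['Td1', 'test', 'erd']

# Assumption: row 0 holds the column names, the rest is data
df = pd.DataFrame(main_ndarray[1:], columns=main_ndarray[0])
result = df[df['Name'].isin(names)]

# Stack the header back on top to get an ndarray shaped like the input
filtered = np.vstack([result.columns.to_numpy(), result.to_numpy()])
print(filtered)
```

Note that every value, including Count, stays a string here, because the source array has a string dtype.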

Answer 3

I would also go with @YXD's Pandas solution, but for the sake of completeness here is a simple solution based on a list comprehension:

data = [['Name', 'Col1', 'Count'],
 ['test', '', '413'],
 ['erd', ' ', '60'],
 ['Td1', 'f', '904'],
 ['Td2', 'K', '953'],
 ['Td3', 'r', '111']]

names = ['Td1', 'test', 'erd']

# keep only the rows whose first element is in names
res = [row for row in data if row[0] in names]

# put the header row back at the front
res.insert(0, data[0])

which then gives you the desired output:

[['Name', 'Col1', 'Count'],
 ['test', '', '413'],
 ['erd', ' ', '60'],
 ['Td1', 'f', '904']]
