
Cutting structured numpy arrays

Hello numpy masters of the world. I would like to find a better solution for the following task. Given a structured array:

import numpy as np
data = np.zeros((3,), dtype=[('value1', 'i4'), ('value2', 'f4'), ('name', 'S6')])
data[:] = [(1, 2., 'Hello'), (2, 3., 'World'), (4, 5., 'year')]

I often find myself searching the data array for a row like this:

line  = data[data["name"]=="World"]

The next thing I would like to achieve is to strip the "name" field from that row. So I do:

names = line.dtype.names
sline = line[ [name for name in names][:-1] ]

And to get the values:

result = sline[0]
print(result)
(2, 3.0)

As you can see, this is a relatively complicated and not very readable approach. The problem is that a row of a structured array is not sliceable (line[0][:-1] does not work). That is why I need the line with names and the loop over them to be able to cut. All of this is much easier if data is a normal numpy array without the structure, because one can use the powerful slicing syntax there. On the other hand, I like being able to find values in a structured array by their names rather than by cryptic index numbers. It represents my data just too well to give it up. So is there a nicer way of cutting down a structured array in rows and columns without converting it to a normal numpy array?

Cheers

I find this easier with Pandas DataFrames:

import pandas as pd
a=pd.DataFrame(data)
a
   value1  value2   name
0       1       2  Hello
1       2       3  World
2       4       5   year

a[a.name=='World']
   value1  value2   name
1       2       3  World
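To also cut away the name column, the row filter can be combined with a column selection. The following is only a sketch: it assumes a unicode 'U6' name field instead of 'S6' so that the string comparison works on Python 3, and the variable names sub and dropped are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same layout as the question, but with a unicode name field (assumption)
data = np.zeros(3, dtype=[('value1', 'i4'), ('value2', 'f4'), ('name', 'U6')])
data[:] = [(1, 2., 'Hello'), (2, 3., 'World'), (4, 5., 'year')]

a = pd.DataFrame(data)

# Select rows by condition and columns by name in one step
sub = a.loc[a['name'] == 'World', ['value1', 'value2']]

# Alternatively, filter the rows first and drop the unwanted column by name
dropped = a[a['name'] == 'World'].drop(columns='name')

print(sub)
```

Both forms keep the original row label (1 here), so the result can still be traced back to the source row.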

Since your data is structured, wouldn't it be easier to access the values like this?

# get that array row
data[data['name']=='World'][0]
(2, 3.0, 'World')

# get individual value
data[data['name']=='World'][0][0]
2
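If the goal is to drop the name field rather than read a single value, the fields to keep can be chosen by name instead of by position. This is a minimal sketch, assuming a reasonably recent NumPy and a unicode 'U6' name field (not the 'S6' from the question) so that the comparison with a str works on Python 3; keep, values and by_name are illustrative names.

```python
import numpy as np

# Same layout as the question, but with a unicode name field (assumption)
data = np.zeros(3, dtype=[('value1', 'i4'), ('value2', 'f4'), ('name', 'U6')])
data[:] = [(1, 2., 'Hello'), (2, 3., 'World'), (4, 5., 'year')]

# Pick the fields to keep by name, so the position of 'name' does not matter
keep = [n for n in data.dtype.names if n != 'name']

line = data[data['name'] == 'World']   # row selection with a boolean mask
sline = line[keep]                     # column selection with a list of field names

values = sline[0].item()               # plain Python tuple of the remaining fields
by_name = line[0]['value2']            # a single value, addressed by field name

print(values)
```

Indexing with a list of field names stays within structured arrays, so the remaining fields can still be addressed by name afterwards.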

Updated

To access multiple records, you can also use slicing or even a list comprehension, something like this:

data[data['name'] != ''][1:]
array([(2, 3.0, 'World'), (4, 5.0, 'year')], 
      dtype=[('value1', '<i4'), ('value2', '<f4'), ('name', 'S6')])

data[data['name'] != ''][1:][1][0]
4

print([x[1] for x in data[data['name'] != ''][1:]])
[3.0, 5.0]
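When ordinary row and column slicing is really needed for the numeric fields, one option is a temporary conversion of just those fields. The sketch below assumes NumPy 1.16 or later, which provides numpy.lib.recfunctions.structured_to_unstructured, and again uses a unicode 'U6' name field; note that it does convert the selected fields to a plain array, so it is a complement to the structured array rather than a replacement.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Same layout as the question, but with a unicode name field (assumption)
data = np.zeros(3, dtype=[('value1', 'i4'), ('value2', 'f4'), ('name', 'U6')])
data[:] = [(1, 2., 'Hello'), (2, 3., 'World'), (4, 5., 'year')]

# Convert only the numeric fields to a regular 2-D array;
# the int and float fields are promoted to a common float dtype
numeric = rfn.structured_to_unstructured(data[['value1', 'value2']])

# Ordinary slicing now works on both axes
tail = numeric[1:, :]         # all columns of the last two rows
second_col = numeric[1:, 1]   # 'value2' of the last two rows

print(second_col.tolist())
```

The structured array itself is left untouched, so name-based lookups such as data['name'] remain available next to the sliceable copy.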
