
Cutting structured numpy arrays

Hello numpy masters of the world. I would like to find a better solution for the following task. Suppose you have a structured array:

import numpy as np
data    = np.zeros((3,),dtype=( [('value1', 'i4'), ('value2', 'f4'),('name','S6')] ) )
data[:] = [(1,2.,'Hello'),(2,3.,"World"), (4, 5, "year")]

I often find myself searching the data array for a line like this:

line  = data[data["name"]=="World"]

The next thing I would like to achieve is to strip the "name" field from the line. So I do:

names = line.dtype.names
sline = line[ [name for name in names][:-1] ]

And to get the values:

result = sline[0]
print(result)
(2, 3.0)

As you can see, this is relatively complicated and not very readable. The problem is that a row of a structured array is not sliceable (line[0][:-1] does not work), which forces me to go through the field names and loop over them to cut off a column. All of this is much easier when data is a plain numpy array without the structure, because the powerful slicing syntax can be used there. On the other hand, I like being able to find values in a structured array by their names rather than by cryptic index numbers. It represents my data just too well to give it up. So is there a nicer way to cut down a structured array by rows and columns without converting it to a plain numpy array?

Cheers

I find this easier with pandas DataFrames:

import pandas as pd
a=pd.DataFrame(data)
a
   value1  value2   name
0       1       2  Hello
1       2       3  World
2       4       5   year

a[a.name=='World']
   value1  value2   name
1       2       3  World
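The question also asks about dropping the "name" column, which the DataFrame approach above can cover in the same step. The following is only a sketch, assuming Python 3 and a 'U6' (unicode) name field instead of the original 'S6' so that plain string comparison works; the variable names row, first_two and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the question; 'U6' instead of 'S6' is an assumption
# made here so that comparing against a str works under Python 3.
data = np.array(
    [(1, 2.0, "Hello"), (2, 3.0, "World"), (4, 5.0, "year")],
    dtype=[("value1", "i4"), ("value2", "f4"), ("name", "U6")],
)
df = pd.DataFrame(data)

# Select rows by condition and columns by name in a single .loc call.
row = df.loc[df["name"] == "World", ["value1", "value2"]]

# Positional "all but the last column", closer to the [:-1] idea.
first_two = df.loc[df["name"] == "World", df.columns[:-1]]

# First matching row as a Series, still addressable by column name.
values = row.iloc[0]
print(values["value1"], values["value2"])
```

Here .loc takes a boolean row mask and a list (or Index) of column labels, so the row filter and the column cut stay readable and name-based.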

Since your data is structured, wouldn't it be easier to access the values like this?

# get that array row
data[data['name']=='World'][0]
(2, 3.0, 'World')

# get individual value
data[data['name']=='World'][0][0]
2

Updated

To access multiple records, you can also use slicing or even a list comprehension, something like this:

data[data['name'] != ''][1:]
array([(2, 3.0, 'World'), (4, 5.0, 'year')], 
      dtype=[('value1', '<i4'), ('value2', '<f4'), ('name', 'S6')])

data[data['name'] != ''][1:][1][0]
4

print([x[1] for x in data[data['name'] != ''][1:]])
[3.0, 5.0]
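If the goal is to stay in numpy and still drop the "name" field, one possible approach is to index with a list of field names, or to use drop_fields from numpy.lib.recfunctions. This is a sketch rather than a definitive answer: it assumes Python 3 and a 'U6' name field (the question uses 'S6'), and the names mask, keep, sline and dropped are illustrative only.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Same data as in the question; 'U6' instead of 'S6' is an assumption
# made here so that comparing against a str works under Python 3.
data = np.array(
    [(1, 2.0, "Hello"), (2, 3.0, "World"), (4, 5.0, "year")],
    dtype=[("value1", "i4"), ("value2", "f4"), ("name", "U6")],
)

# Row selection with a boolean mask.
mask = data["name"] == "World"

# Column selection: slice the tuple of field names, then index with a list.
keep = list(data.dtype.names[:-1])
sline = data[mask][keep]
result = sline[0]
print(result.item())

# Alternative: drop the field by name; this returns a new array.
dropped = rfn.drop_fields(data[mask], "name", usemask=False)
print(dropped[0].item())
```

Slicing data.dtype.names gives the positional [:-1] behaviour while the rows themselves stay addressable by field name; drop_fields reads more directly when the field to remove is known by name.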
