I have a set of data that I would like to process with numpy. The data can be viewed as a set of points in space, each with an additional property that I would like to handle as an object. Depending on the data set, the position vectors may be of length 1, 2, or 3, but the length is the same for all points in a given data set. The property object is an instance of a custom class, and any two points may share the same object.
So consider this data as a random example (C and H represent objects that contain atomic properties for Carbon or Hydrogen ... or just some random object). These will not be read in through a file, but created by an algorithm. Here the C object may be the same or it may be different (isotope for example).
Example 3D data set (just abstract representation)
C 1 2 3
C 3 4 5
H 1 1 4
I would like to have a numpy array that contains all of the atomic positions, so that I can perform numpy operations such as vector manipulation, for example a translation function:
def translate(data,vec):return data + vec
I would also like to handle the property objects in parallel. One option would be to keep two separate arrays, but if I delete an element from one, I would have to explicitly delete the corresponding value from the property array as well. This could get difficult to manage.
I considered using numpy.recarray
x = np.array([(1.0,2,3, "C"), (3.0,2,3, "H")], dtype=[('x', "float64"), ('y', "float64"), ('z', "float64"), ('type', object)])
But it seems the shape of this array is (2,), which means that each record is handled independently. Also, I cannot work out how to get vector manipulation to work with this type:
def translate(data,vec):return data + vec
translate(x,np.array([1,2,3]))
...
TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'numpy.ndarray'
Is numpy.recarray what I should be using? Is there a simpler way to handle this, such that I have a separate numerical matrix of points with a parallel object array, and the two stay linked when an element is removed (np.delete)? I also briefly considered writing an array object that extends ndarray, but I feel like this may be unnecessary and potentially disastrous.
Any thoughts or suggestions would be very helpful.
A field of a structured array (or recarray) can itself be an ndarray if you pass the tuple (name, type, shape) as that field's entry in the dtype:
In [9]:
import numpy as np
x = np.array([((1.0,2,3), "C"), ((3.0,2,3), "H")], dtype=[('xyz', "float64", (3,)), ('type', object)])
In [11]:
np.delete(x, 0)
Out[11]:
array([([3.0, 2.0, 3.0], 'H')],
dtype=[('xyz', '<f8', (3,)), ('type', 'O')])
In [12]:
x["xyz"]
Out[12]:
array([[ 1., 2., 3.],
[ 3., 2., 3.]])
In [14]:
x["xyz"] + (10, 20, 30)
Out[14]:
array([[ 11., 22., 33.],
[ 13., 22., 33.]])
For your translate function:
def translate(data, vec):
    tmp = data.copy()  # leave the caller's array unchanged
    tmp["xyz"] += vec
    return tmp
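To tie this back to the question, here is a minimal sketch that builds such a structured array for points of any dimension and stores custom objects in the second field. The Element class and the make_points helper are hypothetical names invented for this illustration; they are not part of numpy.

```python
import numpy as np

class Element:
    """Hypothetical stand-in for the custom property class in the question."""
    def __init__(self, symbol, mass):
        self.symbol = symbol
        self.mass = mass

def make_points(coords, props):
    """Build a structured array with an 'xyz' vector field and an object field."""
    coords = np.asarray(coords, dtype="float64")
    dim = coords.shape[1]  # vector length, taken from the data set
    out = np.empty(len(coords), dtype=[("xyz", "float64", (dim,)), ("type", object)])
    out["xyz"] = coords
    for i, prop in enumerate(props):
        out["type"][i] = prop  # stores a reference, not a copy of the object
    return out

def translate(data, vec):
    out = data.copy()  # copies the coordinates; property objects stay shared
    out["xyz"] += np.asarray(vec, dtype="float64")
    return out

c12 = Element("C", 12.0)
h = Element("H", 1.008)
pts = make_points([[1, 2, 3], [3, 4, 5], [1, 1, 4]], [c12, c12, h])

moved = translate(pts, [1, 2, 3])
kept = np.delete(moved, 1)  # removes the position and its object together
```

Because each record holds both the position and the property object, np.delete cannot leave the two out of sync. Note that copy() is shallow for the object field, so points that shared an Element before the copy still share it afterwards.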
If you want more flexible functions, you may consider using pandas.DataFrame.
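As a rough sketch of what the pandas route could look like, assuming pandas is installed: the column names x, y, z and type and the translate helper below are choices made for this illustration, and the type column could hold custom objects instead of strings.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": [1.0, 3.0, 1.0],
    "y": [2.0, 4.0, 1.0],
    "z": [3.0, 5.0, 4.0],
    "type": ["C", "C", "H"],
})
coords = ["x", "y", "z"]  # the numeric columns that make up the position

def translate(frame, vec):
    out = frame.copy()
    # Work on the coordinate columns only; the 'type' column is untouched.
    out[coords] = out[coords].to_numpy() + np.asarray(vec, dtype="float64")
    return out

moved = translate(df, [1, 2, 3])
kept = moved.drop(index=1)            # drops the whole row, property included
carbons = moved[moved["type"] == "C"]  # select rows by property
```

Here each row carries its own property value, so dropping a row removes the position and the property in one step.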
If you are dealing with collections of atoms, you may consider using the Atoms class from the Atomic Simulation Environment (ASE). It stores atom types and positions, and has list-like methods to manipulate them.
One quick and dirty way would be to set the last (or indeed any) column to be a numerical lookup to a labels dictionary:
>>> import numpy
>>> labels = ['H', 'C', 'O']
>>> labels_refs = dict(zip(labels, numpy.arange(len(labels), dtype='float64')))
>>> reverse_labels_refs = dict(zip(numpy.arange(len(labels), dtype='float64'), labels))
>>> x = numpy.array([
... [1.0,2,3, labels_refs['C']],
... [3.0,2,3, labels_refs['H']],
... [2.0,2,3, labels_refs['C']]])
>>> x
array([[ 1., 2., 3., 1.],
[ 3., 2., 3., 0.],
[ 2., 2., 3., 1.]])
>>> extract_refs = numpy.vectorize(
... lambda label_ref: reverse_labels_refs[label_ref])
>>> labels = extract_refs(x[:, -1]) # Turn the last column back into labels
>>> labels
array(['C', 'H', 'C'],
dtype='|S8')
You can also look up rows by their labels (as an example):
>>> x[numpy.where(x[:,-1] == labels_refs['C']), :-1]
array([[[ 1., 2., 3.],
[ 2., 2., 3.]]])
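Note that indexing with numpy.where as above yields a 3-D result with a leading axis of length 1. A possible variation, sketched here under the same layout (coordinates in the first three columns, label code in the last), is to index with a boolean mask so the result stays 2-D, and to translate only the coordinate columns. The names codes, is_carbon and shifted are made up for this example.

```python
import numpy as np

labels = ["H", "C", "O"]
codes = {label: i for i, label in enumerate(labels)}  # label -> integer code

x = np.array([
    [1.0, 2, 3, codes["C"]],
    [3.0, 2, 3, codes["H"]],
    [2.0, 2, 3, codes["C"]],
])

# A boolean mask keeps the result 2-D, with shape (n_matches, 3).
is_carbon = x[:, -1] == codes["C"]
carbon_xyz = x[is_carbon, :-1]

# Translate only the coordinate columns so the label codes are not shifted.
shifted = x.copy()
shifted[:, :3] += np.array([1.0, 2.0, 3.0])

# Turn the code column back into labels.
row_labels = [labels[int(code)] for code in x[:, -1]]
```

This keeps everything in one float array, but it can only encode a fixed set of labels as numbers. It does not store arbitrary property objects per point.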