
How do I coerce a python list of dictionaries containing date strings into a numpy record array of datetime objects?

This question is about constructing a NumPy record array from a list of dictionaries (as in the other question), but specifically for the datetime dtype. The method outlined there did not work for me. Details follow.

I have a Python 3 list of dictionaries, for example:

e = [{"date":"2019-11-07","value":3.147},{"date":"2019-11-08","value":2.7315}]

I want to convert e to a numpy record array/structure, and cast the date string to a datetime or np.datetime64 object in one fell swoop.

But the following does not work: the date field either remains a string (not any sort of datetime object), or the call throws ValueError: Could not convert object to NumPy datetime.

import numpy as np
what_goes_here = 'datetime64[s]' # or 'M8[D]', or..?
e_type = np.dtype([('date', what_goes_here), ('value', float)])
i = np.array(e, dtype=e_type)

Is there a way to achieve all this in one step, and if so, how?

Please no Python 2 or Pandas.

This is the opposite transformation to "Efficient way to convert numpy record array to a list of dictionary", plus the added datetime complication.

Extract the values into a list of tuples first, then build the record array with the date field typed as datetime64 and the value field as float:

import numpy as np
e = [{"date": "2019-11-07", "value": 3.147},
     {"date": "2019-11-08", "value": 2.7315}]
# Structured arrays are filled from tuples, one tuple per record
e = [(d["date"], d["value"]) for d in e]
# Use the builtin float: the np.float alias was deprecated in NumPy 1.20 and removed in 1.24
e = np.rec.array(e, dtype=[('date', 'datetime64[s]'), ('value', float)])
print('result: ', e)
print('data type of date: ', type(e.date[0]))

# Output
result:  [('2019-11-07T00:00:00', 3.147 ) ('2019-11-08T00:00:00', 2.7315)]
data type of date:  <class 'numpy.datetime64'>
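
The tuple-extraction step above can be wrapped in a small helper so that the conversion reads as a single call. This is only a sketch: the name dicts_to_recarray is hypothetical (it is not a NumPy function), and it assumes every dictionary contains a key for each field named in the dtype.

```python
import numpy as np

def dicts_to_recarray(rows, dtype):
    """Hypothetical helper: build a record array from a list of dicts."""
    dtype = np.dtype(dtype)
    # Pull the values out in the order the dtype declares its fields,
    # because structured arrays are filled from tuples, not dicts.
    tuples = [tuple(row[name] for name in dtype.names) for row in rows]
    return np.rec.array(tuples, dtype=dtype)

e = [{"date": "2019-11-07", "value": 3.147},
     {"date": "2019-11-08", "value": 2.7315}]

# Day resolution is assumed here; 'datetime64[s]' works the same way.
rec = dicts_to_recarray(e, [("date", "datetime64[D]"), ("value", float)])

print(rec.date)                   # the parsed dates as a datetime64 column
print(rec.date[1] - rec.date[0])  # date arithmetic yields a timedelta64
```

Because the date column is a true datetime64 field rather than a string, subtraction and comparisons behave as date operations.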

According to this answer, the conversion of datetime objects in NumPy changed in version 1.11.

Adapting part of the code from that link, the following should give what you're looking for (with the NumPy datetime type specified explicitly):

import numpy as np

e = [{"date":"2019-11-07","value":3.147},{"date":"2019-11-08","value":2.7315}]
e_type = np.dtype([('date', 'datetime64[us]'), ('value', float)])

records = np.array([(x['date'], x['value']) for x in e],
                   dtype=e_type)

# Produces
array([('2019-11-07T00:00:00.000000', 3.147 ),
       ('2019-11-08T00:00:00.000000', 2.7315)],
      dtype=[('date', '<M8[us]'), ('value', '<f8')])

>>> records[0][0]
numpy.datetime64('2019-11-07T00:00:00.000000')
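
If standard-library datetime objects are wanted rather than np.datetime64, the date column can be converted after the array is built. The sketch below assumes the same microsecond-resolution dtype as above; the names py_datetimes and py_dates are introduced only for illustration. Note that the resulting Python type depends on the datetime64 unit.

```python
import datetime
import numpy as np

e = [{"date": "2019-11-07", "value": 3.147},
     {"date": "2019-11-08", "value": 2.7315}]
e_type = np.dtype([("date", "datetime64[us]"), ("value", float)])
records = np.array([(x["date"], x["value"]) for x in e], dtype=e_type)

# With microsecond resolution, tolist() yields datetime.datetime objects.
py_datetimes = records["date"].tolist()
print(type(py_datetimes[0]), py_datetimes[0])

# With day resolution, the same call yields datetime.date objects instead.
py_dates = records["date"].astype("datetime64[D]").tolist()
print(type(py_dates[0]), py_dates[0])
```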
