
Efficiently splitting a column of numpy array

I have a CSV file with 5 columns, in which the second column is a time in the format 10/22/2001 14:00. I want to create another file with this time data split into separate columns. To split the column, I used the code below in Python:

from numpy import loadtxt
import numpy as np
from time import strptime

filename = 'data/file.csv'
data = loadtxt(filename, delimiter=',', dtype=str, skiprows=1)
newdata = np.zeros((data.shape[0],7))
newdata[:,0] = data[:,0]

for i in range(len(data[:,1])):
    tm =  strptime(data[i,1], "%m/%d/%Y %H:%M")
    newdata[i,1] = tm.tm_year
    newdata[i,2] = tm.tm_wday
    newdata[i,3] = tm.tm_hour

newdata[:,4:] =  data[:,2:]

Is there a better way of doing this using NumPy methods or other Python modules?

You can shorten the generation of newdata using the following three lines:

  1. Convert the datetime strings into datetime objects:

     datetimes = [datetime.strptime(d, "%m/%d/%Y %H:%M") for d in data[:, 1]]

     This assumes you have imported the class with from datetime import datetime.

  2. Collect the year, weekday and hour of each datetime object:

     yearWeekdayHour = [[dt.year, dt.weekday(), dt.hour] for dt in datetimes]

  3. Horizontally stack all parts together: the first column of the original data, the date and time information, and the remaining columns of data:

     newdata = np.hstack((data[:, 0, None], yearWeekdayHour, data[:, 2:]))

     Note that the first column is indexed with an additional None to get a 2D array, which is required for horizontal stacking.
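For reference, the three steps can be put together into one runnable sketch. The sample rows, the column names in the header, and the helper name split_time_column are made up for illustration, and the sketch assumes every column other than the time column is numeric (as the np.zeros array in the question implies). It converts each piece to float explicitly, so the stacked array has a single numeric dtype, and then writes the result out as CSV, since the goal was to create another file.

```python
import io
from datetime import datetime

import numpy as np

# Hypothetical sample standing in for data/file.csv (header row + 5 columns).
SAMPLE_CSV = """id,time,a,b,c
1,10/22/2001 14:00,0.5,1.5,2.5
2,10/23/2001 09:30,3.0,4.0,5.0
"""


def split_time_column(data, fmt="%m/%d/%Y %H:%M"):
    """Return a float array: column 0, year, weekday, hour, then columns 2+.

    Assumes `data` is a 2D array of strings whose column 1 holds the
    timestamps and whose other columns are all numeric.
    """
    # Step 1: parse the timestamp strings.
    datetimes = [datetime.strptime(d, fmt) for d in data[:, 1]]
    # Step 2: collect year, weekday (Monday == 0) and hour per row.
    year_weekday_hour = np.array(
        [[dt.year, dt.weekday(), dt.hour] for dt in datetimes], dtype=float
    )
    # Step 3: stack the pieces side by side; None keeps column 0 two-dimensional.
    first = data[:, 0, None].astype(float)
    rest = data[:, 2:].astype(float)
    return np.hstack((first, year_weekday_hour, rest))


# ndmin=2 keeps the array 2D even if the file has a single data row.
data = np.loadtxt(
    io.StringIO(SAMPLE_CSV), delimiter=",", dtype=str, skiprows=1, ndmin=2
)
newdata = split_time_column(data)

# Write the result as CSV; pass a filename instead of `out` to create a file.
out = io.StringIO()
np.savetxt(out, newdata, delimiter=",", fmt="%g")
print(out.getvalue())
```

To use it on the real file, replace io.StringIO(SAMPLE_CSV) with the path from the question ('data/file.csv') and give np.savetxt an output filename of your choice.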


I'm not quite sure if this is a better solution. Yours might be more readable, especially for those not perfectly familiar with Python list comprehensions. But it might be an alternative worth reading and playing around with; list comprehensions can be quite a powerful tool.

