
IndexError: too many indices for array, datetime strings and numpy genfromtxt

I am having trouble reading data from .txt files correctly.

A sample of my data looks like so:

file1.txt

Time: ID: W: X: Y: Z:
2016/02/25:19:08:41 006124189X 769 372 363 348
2016/02/25:21:41:13 006124189X 769 362 308 390
2016/02/25:22:38:20 006124189X 769 362 363 390
2016/02/26:07:37:42 006124189X 769 372 272 366
2016/02/26:08:54:34 006124189X 769 372 272 366
2016/02/26:09:57:04 006124189X 769 372 363 371

The first column is a datetime string, the second is an ID made up of digits and letters, and the remaining columns are integers in the range 0-10000.

I will eventually plot some of these integer values against the recorded time, but for now I am just trying to read the data in correctly. My current code:

import numpy as np
import matplotlib.pyplot as plt
import pylab
import datetime

#File name for data input.
datafile = 'file1.txt'

#Names to be used for column headers.
names = ['Time', 'ID', 'W', 'X', 'Y', 'Z']

#Read Data from file into array. Skipping the first line. 
#Datatypes used, object for Time, String for ID and Integer for the rest.
data = np.genfromtxt(datafile, skip_header=1, dtype="Object,S11,i8,i8,i8,i8", names = ['Time', 'ID', 'W', 'X', 'Y', 'Z'])

#Print the data called to check it works.
print data

#Designating each column to a name.
Time = data[:,0]
ID = data[:,1]
W = data[:,2]
X = data[:,3]
Y = data[:,4]
Z = data[:,5]

#Print designated column.
print Time

I've tried to be as clear as possible about what I'm trying to do.

Eventually I want to add a matplotlib plot at the end, something like this:

plt.plot(Time,W, label='W vs Time')
plt.xlabel('Time',fontsize=12)
plt.ylabel('W',fontsize=12) 
plt.show()

However, when the script is run in its current form it gives the error:

line 15, in <module>
Time = data[:,0]
IndexError: too many indices for array

The same error occurs for each of the other columns, e.g.:

line 16, in <module>
W = data[:,2]
IndexError: too many indices for array

The earlier print data line correctly outputs all the data in the file, showing each time as a quoted string such as '2016/02/25:19:08:32'.

I am unsure how to handle the data correctly from here. If I just set dtype=i8, I can access any of the data columns except Time and ID, which understandably return -1 for every row.

I have tried following a scipy doc and a Stack Overflow page on a similar problem, but I couldn't get either to work.

Any help is appreciated.

data is a one-dimensional structured array, so a two-index expression like data[:,0] fails. Check its shape and dtype: it has named fields instead of columns.
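A minimal sketch of that check, assuming Python 3 and a NumPy version whose genfromtxt accepts the encoding argument. io.StringIO stands in for file1.txt, only the first three sample rows are used, and the string dtypes U19 and U10 are my substitution for the question's Object and S11:

```python
import io
import numpy as np

# The first three sample rows from the question; StringIO stands in for file1.txt.
sample = io.StringIO(
    "Time: ID: W: X: Y: Z:\n"
    "2016/02/25:19:08:41 006124189X 769 372 363 348\n"
    "2016/02/25:21:41:13 006124189X 769 362 308 390\n"
    "2016/02/25:22:38:20 006124189X 769 362 363 390\n"
)

names = ['Time', 'ID', 'W', 'X', 'Y', 'Z']

# Assumption: U19/U10 (fixed-width text) replace the question's Object/S11.
data = np.genfromtxt(sample, skip_header=1, dtype="U19,U10,i8,i8,i8,i8",
                     names=names, encoding="utf-8")

print(data.shape)        # a single dimension: one record per row
print(data.dtype.names)  # the field names that replace column numbers

# Two indices on a one-dimensional array reproduce the error from the question.
error_message = None
try:
    data[:, 0]
except IndexError as err:
    error_message = str(err)
    print("IndexError:", error_message)
```

Each element of data is one whole record, which is why there is no second axis to slice along.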

ID = data['ID']

should work instead of data[:,1].

Or

Time = data[names[0]]
ID = data[names[1]]
...
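Once the fields are read by name, the Time strings still need converting before they can serve as a plot axis. The sketch below is one way to do that, not part of the original answer: the format string "%Y/%m/%d:%H:%M:%S" is inferred from the sample rows, io.StringIO stands in for file1.txt, and U19/U10 are assumed string dtypes in place of Object/S11.

```python
import io
from datetime import datetime

import numpy as np

# The first three sample rows from the question; StringIO stands in for file1.txt.
sample = io.StringIO(
    "Time: ID: W: X: Y: Z:\n"
    "2016/02/25:19:08:41 006124189X 769 372 363 348\n"
    "2016/02/25:21:41:13 006124189X 769 362 308 390\n"
    "2016/02/25:22:38:20 006124189X 769 362 363 390\n"
)

names = ['Time', 'ID', 'W', 'X', 'Y', 'Z']

# Assumption: U19/U10 (fixed-width text) replace the question's Object/S11.
data = np.genfromtxt(sample, skip_header=1, dtype="U19,U10,i8,i8,i8,i8",
                     names=names, encoding="utf-8")

# Access each column by field name rather than by position.
W = data['W']
ID = data['ID']

# Assumption: timestamp layout inferred from the sample, e.g. 2016/02/25:19:08:41.
time_format = "%Y/%m/%d:%H:%M:%S"
Time = [datetime.strptime(stamp, time_format) for stamp in data['Time']]

print(Time[0], ID[0], W[0])
```

A list of datetime objects like Time can then be passed as the x values in the plt.plot(Time, W, ...) call from the question.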

The genfromtxt documentation could be clearer on this point. It should stress that when names is used, the result is a structured array with a compound dtype, and that the data must be accessed by field name.
