
Converting 2D Array into 3D Array Numpy Python 3.6

I'm trying to write code that reads in multiple data files so I can analyse them. I have the following:

import numpy as np
import os
import glob

os.chdir('/Users/basilowen/Documents/Physics Degree/Fourth Year /Project/dielectric raw data/B22 full temp scan') 


data_files = glob.glob('**/*.TXT',recursive=True)

data_files.sort()

data = np.array([]).reshape(0,6)

for i in data_files:
    item=np.genfromtxt(i,skip_header=3)
    data=np.r_[data,item]

data=np.reshape(data,(37,6,11),order="F")

My code reads 11 .txt files (all with exactly the same dimensions, just different values), skipping the first 3 lines of each, and appends them to an initially empty array. Each .txt file is a data set of shape (37,6) once the first 3 lines are skipped, measured at one temperature, so after the for loop has finished the array has shape (407,6).

I want to access the data for each .txt file (each temperature) separately, so I thought I could reshape the array to (37,6,11), i.e. rearrange the indices without changing the data. I could then loop over the third index and make plots etc. for each temperature. (Later on it would also be useful to record which temperature each (37,6) data set belongs to and label it somehow.)

Firstly, when I reshape the array I'm unsure what the order parameter does. I have looked it up, and the NumPy docs say "F" means the first index changes fastest, like Fortran, and "C" means the last index changes fastest, like C. What does that really mean? I can't find an example online that explains this clearly.

Secondly, the reshape doesn't give me what I wanted: when I look at a particular entry of the final array, it's not the value it should be.

Below is a picture of roughly what I want the final array to look like:

[image: enter image description here]

Finally, am I going about this task in the right way, i.e. is there an easier way to do what I want?

EDIT: below is one of the .txt files for a specific sample (there are 10 other .txt files in the same folder, which hold the measurement data for the same sample at 10 other temperatures).

B22 polymer blend membrane full temp scan  Temp. [K]=2.5315e+02  AC Volt  [Vrms]=1.000e+00
Fixed value(s) :  Temp. [K] = 2.531e+02,   AC Volt  [Vrms] = 1.000e+00
 Freq. [Hz]  Eps'    Eps''   Sig' [S/cm]     Sig'' [S/cm]    |Sig| [S/cm]
 17779e+02   30662e-04   24080e-05   23817e-11  -20436e-10   20575e-10
 11545e+02   31479e-04   30542e-05   19616e-11  -13795e-10   13934e-10
 74968e+01   32179e-04   34625e-05   14441e-11  -92501e-11   93621e-11
 48680e+01   33029e-04   39643e-05   10736e-11  -62368e-11   63286e-11
 31611e+01   34038e-04   45633e-05   80250e-12  -42274e-11   43028e-11
 20526e+01   35202e-04   52863e-05   60365e-12  -28778e-11   29405e-11
 13329e+01   36526e-04   61647e-05   45713e-12  -19670e-11   20194e-11
 86551e+00   38046e-04   72474e-05   34896e-12  -13505e-11   13948e-11
 56202e+00   39796e-04   85994e-05   26887e-12  -93161e-12   96964e-12
 36495e+00   41817e-04   10314e-04   20940e-12  -64598e-12   67908e-12
 23698e+00   44167e-04   12518e-04   16504e-12  -45045e-12   47973e-12
 15388e+00   46917e-04   15395e-04   13180e-12  -31604e-12   34242e-12
 99924e-01   50170e-04   19208e-04   10678e-12  -22331e-12   24752e-12
 64886e-01   54075e-04   24318e-04   87782e-13  -15910e-12   18171e-12
 42133e-01   58865e-04   31215e-04   73168e-13  -11454e-12   13591e-12
 27359e-01   64869e-04   40525e-04   61681e-13  -83513e-13   10382e-12
 17766e-01   72393e-04   53038e-04   52421e-13  -61667e-13   80937e-13
 11536e-01   81764e-04   69974e-04   44908e-13  -46057e-13   64327e-13
 74911e-02   92874e-04   92921e-04   38725e-13  -34538e-13   51889e-13
 48643e-02   10589e-03   12505e-03   33841e-13  -25948e-13   42644e-13
 31587e-02   12049e-03   17113e-03   30071e-13  -19416e-13   35795e-13
 20511e-02   13611e-03   23865e-03   27232e-13  -14390e-13   30801e-13
 13319e-02   15243e-03   34008e-03   25199e-13  -10554e-13   27320e-13
 86485e-03   16916e-03   49441e-03   23788e-13  -76576e-14   24990e-13
 56159e-03   18608e-03   73042e-03   22820e-13  -55012e-14   23474e-13
 36467e-03   20401e-03   10937e-02   22189e-13  -39360e-14   22535e-13
 23680e-03   22460e-03   16526e-02   21771e-13  -28271e-14   21954e-13
 15377e-03   25244e-03   25129e-02   21497e-13  -20740e-14   21596e-13
 99848e-04   29716e-03   38317e-02   21284e-13  -15951e-14   21344e-13
 64836e-04   37803e-03   58596e-02   21135e-13  -13275e-14   21177e-13
 42101e-04   53963e-03   89635e-02   20994e-13  -12405e-14   21031e-13
 27339e-04   86936e-03   13687e-01   20817e-13  -13070e-14   20858e-13
 17752e-04   15210e-02   20849e-01   20591e-13  -14922e-14   20645e-13
 11527e-04   28215e-02   31642e-01   20291e-13  -18029e-14   20371e-13
 74854e-05   53790e-02   47764e-01   19890e-13  -22358e-14   20016e-13
 48606e-05   10146e-01   71407e-01   19309e-13  -27410e-14   19503e-13
 31563e-05   19010e-01   10584e+00   18584e-13  -33363e-14   18881e-13

(I'm unsure how to attach this as .txt file to this post)

You already know the desired shape of your data array before reading the files, so it is simpler to create it directly with its final shape:

import numpy as np
import os
import glob

os.chdir('/Users/basilowen/Documents/Physics Degree/Fourth Year /Project/dielectric raw data/B22 full temp scan') 


data_files = glob.glob('**/*.TXT',recursive=True)

data_files.sort()

data = np.empty((37,6,11))

for i,file in enumerate(data_files):
    data[:,:,i]=np.genfromtxt(file,skip_header=3)

To do it with reshape instead, use order='C' and put 11 as the first dimension, i.e. reshape the (407,6) array to (11,37,6). Here is a simple example of how the two orders differ. We start with a 1D array and reshape it to a 2D array:

import numpy as np
a = np.arange(12)
c = a.reshape((6, 2),order='C')
f = a.reshape((6, 2),order='F')

With order='C', the last (here the second) index of the output changes fastest, so the following mapping is applied:

  • a[0] -> c[0,0]
  • a[1] -> c[0,1]
  • a[2] -> c[1,0]
  • ...

whereas with order='F' the mapping is like this instead:

  • a[0] -> f[0,0]
  • a[1] -> f[1,0]
  • a[2] -> f[2,0]
  • ...

Therefore, the outputs will be:

>>> c
array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11]])
>>> f
array([[ 0,  6],
       [ 1,  7],
       [ 2,  8],
       [ 3,  9],
       [ 4, 10],
       [ 5, 11]])
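
As a small check of the two mappings above, the sketch below compares a few entries and shows that the order='F' result is the same as filling a (2, 6) array in C order and transposing it. The name f_via_transpose is introduced here purely for illustration.

```python
import numpy as np

a = np.arange(12)

# order='C' fills the output row by row; order='F' fills it column by column.
c = a.reshape((6, 2), order='C')
f = a.reshape((6, 2), order='F')

# The 'F' result equals a C-order reshape to the reversed shape, then a transpose.
f_via_transpose = a.reshape((2, 6)).T

print(np.array_equal(f, f_via_transpose))  # True
print(c[1, 0], f[1, 0])                    # 2 1
```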

Finally, we can use this simple example to illustrate your situation by reshaping c to a 3D array:

>>> arr_3d = c.reshape((3,2,2),order='C')
>>> arr_3d
array([[[ 0,  1],
        [ 2,  3]],

       [[ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11]]])
>>> arr_3d[0,:,:]
array([[0, 1],
       [2, 3]])
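
The same idea can be tried at the sizes from the question. This is only a sketch with synthetic data standing in for the real files (each fake "file" k holds values starting at k * 1000, an assumption made so blocks are easy to tell apart); the names blocks, stacked, by_file, wanted and scrambled are illustrative.

```python
import numpy as np

n_files, n_rows, n_cols = 11, 37, 6

# Synthetic stand-ins for the files: block k holds k*1000 + 0..221.
blocks = [
    k * 1000 + np.arange(n_rows * n_cols, dtype=float).reshape(n_rows, n_cols)
    for k in range(n_files)
]
stacked = np.concatenate(blocks, axis=0)   # shape (407, 6), as in the question

# order='C' with the file index first keeps each file's rows together.
by_file = stacked.reshape((n_files, n_rows, n_cols), order='C')

# Move the file axis last to get the (37, 6, 11) layout asked for.
wanted = by_file.transpose(1, 2, 0)

# The reshape from the question mixes values from different files.
scrambled = stacked.reshape((n_rows, n_cols, n_files), order='F')

print(np.array_equal(wanted[:, :, 3], blocks[3]))     # True
print(np.array_equal(scrambled[:, :, 3], blocks[3]))  # False
```

The transpose returns a view, so no data is copied when moving between the (11, 37, 6) and (37, 6, 11) layouts.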

It's better to collect the arrays in a list, and join them with one call at the end. It's more efficient and easier to do.

alist = []
# data = np.array([]).reshape(0,6)

for i in data_files:
    item=np.genfromtxt(i,skip_header=3)     # (37,6) array
    alist.append(item)
    # data=np.r_[data,item]
data = np.stack(alist, axis=2)  # to make (37,6,11)
# data=np.reshape(data,(37,6,11),order="F")

# np.stack(alist) # default axis=0
# np.array(alist) # both produce (11,37,6)

Worrying about order is premature. The arrays created by genfromtxt are order 'C'. NumPy can access data along any of the 3 axes with equal ease. It's easiest to loop, if needed, on the first dimension.
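
To illustrate access along different axes, here is a minimal sketch using a small synthetic array in place of the (37, 6, 11) one (4 rows, 3 columns, 2 "files" are assumed sizes chosen only to keep the output short). np.moveaxis gives a view with the file axis first, so a plain for loop yields one table per file.

```python
import numpy as np

# Small synthetic stand-in for the (37, 6, 11) array: 4 rows, 3 columns, 2 files.
data = np.arange(24, dtype=float).reshape(4, 3, 2)

# Indexing the last axis picks out one file's (rows, columns) table.
first_file = data[:, :, 0]

# Move the file axis to the front; this is a view, not a copy.
per_file = np.moveaxis(data, 2, 0)
for k, table in enumerate(per_file):
    print(k, table.shape)   # prints "0 (4, 3)" then "1 (4, 3)"

print(np.shares_memory(per_file, data))  # True
```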

After making the 3d array you can easily reshape, and/or transpose if one ordering proves to be more convenient. You'd have to do some time tests to verify whether one order or other is faster for the kinds of access you need. Get things working, and worry about speed later.
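
The question also asks about labelling each data set with its temperature. One possible approach, sketched here under the assumption that every file's first line contains a "Temp. [K]=..." field like the sample shown in the question, is to parse that line with a regular expression. The helper parse_temperature and the pattern TEMP_PATTERN are hypothetical names, not part of NumPy or of the answers above.

```python
import re

# Hypothetical helper: extract the temperature in kelvin from a header line
# of the form shown in the question. Assumes a "Temp. [K]=<number>" field.
TEMP_PATTERN = re.compile(
    r"Temp\.\s*\[K\]\s*=\s*([-+]?[0-9.]+(?:[eE][-+]?[0-9]+)?)"
)

def parse_temperature(header_line):
    match = TEMP_PATTERN.search(header_line)
    if match is None:
        raise ValueError("no 'Temp. [K]=' field found in header line")
    return float(match.group(1))

header = ("B22 polymer blend membrane full temp scan  Temp. [K]=2.5315e+02  "
          "AC Volt  [Vrms]=1.000e+00")
print(parse_temperature(header))  # 253.15
```

Inside the file loop, the first line of each file could be read with open(...).readline() and the parsed value appended to a list, so that the k-th temperature lines up with the k-th slice of the 3D array.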
