Read text file only for certain rows to split up file in Python

I am loading a text file with np.loadtxt and would like Python to split it into four parts. Normally I would copy and paste each set of data into its own text file and call np.loadtxt on each one, but I will have to do this hundreds of times, so that would be far too time-consuming.

Below is a shortened version of the text file. I'd like Python to read the first number (0.699999988) and discard it, then read the following 5 rows and assign a variable name to each column, then do the same for the next 5 rows, and so on.

Is there any way to tell Python to run np.loadtxt only on row 1, then only on rows 2 to 6, then on rows 7 to 11, etc.?

   0.699999988
   1    0.2000    0.0618
   2    0.2500    0.0417
   3    0.3000    0.0371
   4    0.3500    0.0390
   5    0.4500    0.0761
    670.0000  169.4000 6.708E-09
    635.0001  169.1806 1.584E-08
    612.9515  168.6255 2.724E-08
    591.2781  168.2719 4.647E-08
  670.00  0.0E+00  0.0E+00  0.0E+00  0.0E+00  0.0E+00  0.0E+00  0.0E+00
  635.00  9.8E-07  4.2E-07  2.1E-07  1.2E-07  4.4E-08  1.8E-08  1.4E-08
  612.95  6.0E-06  3.5E-06  2.1E-06  1.3E-06  4.7E-07  1.8E-07  1.4E-07
  591.28  2.2E-05  1.3E-05  7.7E-06  4.9E-06  1.8E-06  6.6E-07  5.0E-07
  569.98  8.3E-05  5.0E-05  2.8E-05  1.8E-05  6.4E-06  2.4E-06  1.8E-06
  549.06  3.0E-04  1.8E-04  1.0E-04  6.2E-05  2.3E-05  8.4E-06  6.4E-06
  528.51  7.8E-04  5.0E-04  2.8E-04  1.7E-04  6.2E-05  2.3E-05  1.8E-05
  508.34  1.6E-03  1.0E-03  5.8E-04  3.4E-04  1.3E-04  4.9E-05  3.7E-05

Here is what I was using for my three different text files:

altvall, T, Pp = np.loadtxt('file1.txt', usecols=(0, 1, 2), unpack=True)  # load text file

tau1, tau2, tau3, tau4, tau5, tau6, tau7 = np.loadtxt('file2.txt', usecols=(1, 2, 3, 4, 5, 6, 7), unpack=True)  # load text file

wvln, alb = np.loadtxt('file3.txt', usecols=(1, 2), unpack=True)  # load text file

Now I just want something similar but without splitting my text file into different parts.

A simple way is to use itertools.zip_longest (named izip_longest in Python 2) to read the lines of your input file in groups of 5. The key idiom is:

# zip_longest is the Python 3 name of Python 2's izip_longest
for rows in zip_longest(*[file_object]*N):
    # rows is a tuple of N consecutive lines (a short final group is padded with None)
    # do something with rows
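
To see why this idiom groups consecutive lines, here is a minimal Python 3 sketch that uses a plain list iterator in place of the open file. The sample values and N = 3 are made up for illustration.

```python
from itertools import zip_longest  # izip_longest in Python 2

# A plain iterator standing in for an open file (illustrative values only).
lines = iter(["a", "b", "c", "d", "e", "f", "g"])
N = 3

# [lines] * N is a list holding the SAME iterator N times, so each output
# tuple is filled by N successive next() calls on that single iterator.
groups = list(zip_longest(*[lines] * N))
print(groups)
# [('a', 'b', 'c'), ('d', 'e', 'f'), ('g', None, None)]
```

The last tuple is padded with None because 7 is not a multiple of 3, so any code that processes the groups should be prepared to skip that padding.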

Full example:

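import numpy as np
from itertools import zip_longest  # Python 3 name; izip_longest in Python 2

data = []
with open(filename, 'r') as fin:  # filename is the path to your text file
    next(fin)  # skip the first line (the lone 0.699999988)
    for rows in zip_longest(*[fin]*5):  # read fin 5 lines at a time
        # convert each line to floats, skipping the None padding of a short final group
        rows = [[float(x) for x in r.split()] for r in rows if r is not None]
        data.append(np.array(rows))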

This yields a list of arrays, each with 5 rows and as many columns as its block:

>>> print(data)
[array([[ 1.    ,  0.2   ,  0.0618],
       [ 2.    ,  0.25  ,  0.0417],
       [ 3.    ,  0.3   ,  0.0371],
       [ 4.    ,  0.35  ,  0.039 ],
       [ 5.    ,  0.45  ,  0.0761]]),
 array([[  6.70000000e+02,   1.69400000e+02,   6.70800000e-09],
       [  6.35000100e+02,   1.69180600e+02,   1.58400000e-08],
       [  6.12951500e+02,   1.68625500e+02,   2.72400000e-08],
       [  5.91278100e+02,   1.68271900e+02,   4.64700000e-08],
       [  5.69980100e+02,   1.68055300e+02,   7.85900000e-08]]),
 array([[  6.70000000e+02,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00],
       [  6.35000000e+02,   9.80000000e-07,   4.20000000e-07,
          2.10000000e-07,   1.20000000e-07,   4.40000000e-08,
          1.80000000e-08,   1.40000000e-08],
       [  6.12950000e+02,   6.00000000e-06,   3.50000000e-06,
          2.10000000e-06,   1.30000000e-06,   4.70000000e-07,
          1.80000000e-07,   1.40000000e-07],
       [  5.91280000e+02,   2.20000000e-05,   1.30000000e-05,
          7.70000000e-06,   4.90000000e-06,   1.80000000e-06,
          6.60000000e-07,   5.00000000e-07],
       [  5.69980000e+02,   8.30000000e-05,   5.00000000e-05,
          2.80000000e-05,   1.80000000e-05,   6.40000000e-06,
          2.40000000e-06,   1.80000000e-06]])]
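
Since the question asks about calling np.loadtxt on specific rows only, a rough alternative is to combine its skiprows and max_rows arguments. This assumes NumPy 1.16 or newer, where loadtxt accepts max_rows. The helper name load_block is hypothetical, the sample text is assembled from the values shown in this thread, and the pairing of variable names with blocks is a guess based on the column counts.

```python
import io
import numpy as np

# Sample input assembled from the values shown in this thread:
# one header value, then three blocks of 5 rows each.
SAMPLE = """\
   0.699999988
   1    0.2000    0.0618
   2    0.2500    0.0417
   3    0.3000    0.0371
   4    0.3500    0.0390
   5    0.4500    0.0761
    670.0000  169.4000 6.708E-09
    635.0001  169.1806 1.584E-08
    612.9515  168.6255 2.724E-08
    591.2781  168.2719 4.647E-08
    569.9801  168.0553 7.859E-08
  670.00  0.0E+00  0.0E+00  0.0E+00  0.0E+00  0.0E+00  0.0E+00  0.0E+00
  635.00  9.8E-07  4.2E-07  2.1E-07  1.2E-07  4.4E-08  1.8E-08  1.4E-08
  612.95  6.0E-06  3.5E-06  2.1E-06  1.3E-06  4.7E-07  1.8E-07  1.4E-07
  591.28  2.2E-05  1.3E-05  7.7E-06  4.9E-06  1.8E-06  6.6E-07  5.0E-07
  569.98  8.3E-05  5.0E-05  2.8E-05  1.8E-05  6.4E-06  2.4E-06  1.8E-06
"""


def load_block(text, first_row, n_rows, usecols):
    """Hypothetical helper: load n_rows lines starting at the 0-based line
    first_row and return one 1-D array per selected column."""
    return np.loadtxt(io.StringIO(text), skiprows=first_row,
                      max_rows=n_rows, usecols=usecols, unpack=True)


# Line 0 holds the lone value to discard; each block is 5 lines long.
# Which names belong to which block is assumed from the column counts.
wvln, alb = load_block(SAMPLE, 1, 5, (1, 2))
altvall, T, Pp = load_block(SAMPLE, 6, 5, (0, 1, 2))
taus = load_block(SAMPLE, 11, 5, tuple(range(1, 8)))  # taus[0] is tau1, etc.
```

With a file on disk, passing the path to np.loadtxt in place of io.StringIO(text) should behave the same way, at the cost of reopening the file for each block. With the data list built above, the equivalent unpacking would be something like altvall, T, Pp = data[1].T.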
