How do I load specific rows from a .txt file in Python?

Question

Say I have a .txt file with many rows and columns of data and a list containing integer values. How would I load the row numbers in the text file which match the integers in the list?

To illustrate, say I have a list of integers:

a = [1,3,5]

How would I read only rows 1, 3 and 5 from a text file into an array?

The loadtxt routine in numpy lets you both skip rows and use particular columns. But I can't seem to find a way to do something along the lines of (ignoring incorrect syntax):

new_array = np.loadtxt('data.txt', userows=a, unpack='true')

Thank you.

Answer 1

Given this file:

1,2,3
4,5,6
7,8,9
10,11,12
13,14,15
16,17,18
19,20,21

You can use the csv module to get the desired np array:

import csv
import numpy as np

desired=[1,3,5]
with open('/tmp/test.csv', 'r') as fin:
    reader=csv.reader(fin)
    result=[[int(s) for s in row] for i,row in enumerate(reader) if i in desired]

print(np.array(result))

Prints:

[[ 4  5  6]
 [10 11 12]
 [16 17 18]]

Answer 2

Just to expand on my comment:

$ cat file.txt
line 0
line 1
line 2
line 3
line 4
line 5
line 6
line 7
line 8
line 9
line 10

Python:

#!/usr/bin/env python

a = [1, 4, 8]

with open('file.txt') as fd:
    for n, line in enumerate(fd):
        if n in a:
            print(line.strip())

output:

$ ./l.py 
line 1
line 4
line 8

Answer 3

You can stick to using numpy's loadtxt function, except that you'll need to pass a generator object to the function instead of the file path.

First define a generator that accepts a filename and row indices and yields only the lines at the specified (zero-based) indices:

def generate_specific_rows(filePath, userows=()):
    # Yield only the lines whose zero-based index is in userows
    with open(filePath) as f:
        for i, line in enumerate(f):
            if i in userows:
                yield line

Now you can create a generator object and pass it to the loadtxt function:

import numpy as np

a = [1,3,5]
gen = generate_specific_rows('data.txt', userows=a)
new_array = np.loadtxt(gen, unpack=True)
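
As a variation on the generator above, here is a minimal self-contained sketch that stops reading once the last wanted row has been seen and that handles comma-separated data. The helper name `pick_lines` and the in-memory sample data are made up for illustration; the sketch assumes a NumPy version whose `loadtxt` accepts a generator of strings.

```python
import io
import numpy as np

def pick_lines(lines, wanted):
    """Yield the lines whose zero-based index is in `wanted`, then stop early."""
    wanted = set(wanted)            # set membership tests are cheap
    last = max(wanted, default=-1)  # no need to read past this index
    for i, line in enumerate(lines):
        if i in wanted:
            yield line
        if i >= last:
            break

# Hypothetical sample data standing in for a real file on disk
sample = io.StringIO(
    "1,2,3\n4,5,6\n7,8,9\n10,11,12\n13,14,15\n16,17,18\n19,20,21\n"
)

arr = np.loadtxt(pick_lines(sample, [1, 3, 5]), delimiter=',')
print(arr)
```

With a real file, the same generator can be fed an open file object instead of the `io.StringIO` stand-in.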

Answer 4

Use the csv module and file.xreadlines().

csv module: implements classes to read and write tabular data in CSV format.

file.xreadlines(): returns the same thing as iter(f), that is, an iterator over the lines of the file. Deprecated since version 2.3 and not available in Python 3: use for line in file instead.
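
A minimal sketch of this suggestion, assuming a comma-separated file and zero-based row indices. The helper name `load_rows` and the sample file are made up for illustration. It iterates over the file object directly (the modern replacement for xreadlines()) through csv.reader.

```python
import csv
import os
import tempfile

def load_rows(path, wanted):
    """Return the rows of a CSV file whose zero-based index is in `wanted`."""
    wanted = set(wanted)
    with open(path, newline='') as f:
        # csv.reader pulls lines from the file object lazily, one at a time
        return [row for i, row in enumerate(csv.reader(f)) if i in wanted]

# Hypothetical sample file, written to a temporary location
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as tmp:
    tmp.write("1,2,3\n4,5,6\n7,8,9\n10,11,12\n")
    sample_path = tmp.name

rows = load_rows(sample_path, [1, 3])
os.remove(sample_path)
print(rows)  # fields come back as strings; convert with int() or float() as needed
```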

Answer 5

I would suggest using line.split() instead of line.strip(). line.split() returns a list, which can easily be converted to a numpy array with np.asarray.
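
A rough sketch of what that could look like, assuming whitespace-separated numeric data and zero-based row indices. The helper name `rows_to_array` and the in-memory sample are made up for illustration.

```python
import io
import numpy as np

def rows_to_array(lines, wanted, sep=None):
    """Split the selected lines into fields and convert them to a float array."""
    wanted = set(wanted)
    # sep=None makes split() break on any whitespace and drop the trailing newline
    picked = [line.split(sep) for i, line in enumerate(lines) if i in wanted]
    return np.asarray(picked, dtype=float)

# Hypothetical sample data standing in for an open text file
sample = io.StringIO("1 2 3\n4 5 6\n7 8 9\n10 11 12\n")

arr = rows_to_array(sample, [1, 3])
print(arr)
```

For comma-separated data, pass sep=',' instead.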

5 answers

solution1: 5, ACCEPTED, 2013-09-24 23:58:57
solution2: 3, 2013-09-24 21:55:16
solution3: 1, 2020-04-15 16:36:08
solution4: 0, 2013-09-24 22:53:41
solution5: 0, 2017-05-26 10:34:39
