How to split the file content by space and end-of-line character?

Question

When I do the following list comprehension I end up with nested lists:

channel_values = [x for x in [ y.split(' ') for y in
    open(channel_output_file).readlines() ] if x and not x == '\n']

Basically I have a file composed of this:

7656 7653 7649 7646 7643 7640 7637 7634 7631 7627 7624 7621 7618 7615
8626 8623 8620 8617 8614 8610 8607 8604 8600 8597 8594 8597 8594 4444
<snip several thousand lines>

Where each line of this file is terminated by a new line.

Basically I need to add each number (they are all separated by a single space) into a list.

Is there a better way to do this via list comprehension?

Answer 1

您不需要为此的列表理解：

channel_values = open(channel_output_file).read().split()

Answer 2

Just do this:

channel_values = open(channel_output_file).read().split()

split() will split according to whitespace that includes ' ' '\\t' and '\\n' . It will split all the values into one list.

If you want integer values you can do:

channel_values = map(int, open(channel_output_file).read().split())

or with list comprehensions:

channel_values = [int(x) for x in open(channel_output_file).read().split()]

Answer 3

Also, the reason the original list comprehension had nested lists is because you added an extra level of list comprehension with the inner set of square brackets. You meant this:

channel_values = [x for x in y.split(' ') for y in
    open(channel_output_file) if x and not x == '\n']

The other answers are still better ways to write the code, but that was the cause of the problem.

Answer 4

Well another problem is that you're leaving the file open. Note that open is an alias for file .

try this:

f = file(channel_output_file)
channel_values = f.read().split()
f.close()

Note they'll be string values so if you want integer ones change the second line to

channel_values = [int(x) for x in f.read().split()]

int(x) will throw a ValueError if you have a non integer value in the file.

Answer 5

Is there a better way to do this via list comprehension?

Sort of..

Instead of reading each line as an array, with the .readlines() methods, you can just use .read() :

channel_values = [x for x in open(channel_output_file).readlines().split(' ')
if x not in [' ', '\n']]

If you need to do anything more complicated, particularly if it involves multiple list-comprehensions, you're almost always better of expanding it into a regular for loop.

out = []
for y in open(channel_output_file).readlines():
    for x in y.split(' '):
        if x not in [' ', '\n']:
            out.append(x)

Or using a for loop and a list-comprehension:

out = []
for y in open(channel_output_file).readlines():
    out.extend(
        [x for x in y.split(' ')
        if x != ' ' and x != '\n'])

Basically, if you can't do something simply with a list comprehension (or need to nest them), list-comprehensions are probably not the best solution.

Answer 6

If you don't care about dangling file references, and you really must have a list read into memory all at once, the one-liner mentioned in other answers does work:

channel_values = open(channel_output_path).read().split()

In production code, I would probably use a generator, why read all those lines if you don't need them?

def generate_values_for_filename(filename):
    with open(filename) as f:
        for line in f:
            for value in line.split():
                yield value

You can always make a list later if you really need to do something other than iterate over values:

channel_values = list(generate_values_for_filename(channel_output_path))

How to split the file content by space and end-of-line character?

Question

6 answers

solution1
16 ACCPTED 2009-11-12 17:49:51

solution2
6 2009-11-12 17:50:20

solution3
2 2009-11-12 18:03:37

solution4
0 2009-11-12 17:56:12

solution5
0 2009-11-12 18:12:48

solution6
0 2009-11-13 02:38:25

How to split the file content by space and end-of-line character?

Question

6 answers

solution1 16 ACCPTED 2009-11-12 17:49:51

solution2 6 2009-11-12 17:50:20

solution3 2 2009-11-12 18:03:37

solution4 0 2009-11-12 17:56:12

solution5 0 2009-11-12 18:12:48

solution6 0 2009-11-13 02:38:25

solution1
16 ACCPTED 2009-11-12 17:49:51

solution2
6 2009-11-12 17:50:20

solution3
2 2009-11-12 18:03:37

solution4
0 2009-11-12 17:56:12

solution5
0 2009-11-12 18:12:48

solution6
0 2009-11-13 02:38:25