Taking each column as its own list

Given a CSV file, the goal is to create one list per column, ignoring the first row, which is the header row.

 var_a        var_b
   a            1
   b            2
   c            3

listA = [var_a] = ['a','b','c']
listB = [var_b] = [1,2,3]

Right now, my only solution is to create empty lists, iterate over the CSV row by row, and append each value to the matching list.

If you have enough memory, you can make it a bit more elegant:

import csv

with open('the.csv', newline='') as f:
    next(f)  # skip the header row
    list_of_rows = list(csv.reader(f))

listA = [row[0] for row in list_of_rows]
listB = [int(row[1]) for row in list_of_rows]

but it's not enormously different from what you say you're doing now -- just a tad more elegant.

(In your example the second column somehow gives a list of ints while the first one gives a list of strs. There's no black magic to do that either, so I explicitly used int where it appears needed.)
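If memory is a concern, the same two lists can be built in a single pass without first materialising list_of_rows. This is only a sketch: the in-memory sample below is a hypothetical stand-in for the.csv and assumes comma-delimited data with exactly two columns.

```python
import csv
import io

# Hypothetical stand-in for open('the.csv', newline=''); assumes commas.
sample = io.StringIO("var_a,var_b\na,1\nb,2\nc,3\n")

listA, listB = [], []
reader = csv.reader(sample)
next(reader)  # skip the header row
for row in reader:
    listA.append(row[0])       # first column stays a string
    listB.append(int(row[1]))  # second column is converted explicitly

print(listA)  # ['a', 'b', 'c']
print(listB)  # [1, 2, 3]
```

Only one row is held at a time besides the output lists, which is essentially the append-as-you-go approach from the question.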

Have you checked out the csv module that ships with Python? It can help shrink your code down.

Also, in terms of the complexity, iterating over each element is the best you can do. If it's easier, you can try loading everything into a matrix

both = [['a', 1], ['b', 2], ['c', 3]]

(which is what Python's csv tools will naturally do for you), and then transposing:

z = list(zip(*both))
listA = list(z[0])  # zip yields tuples; convert to a list so you can edit it
listB = list(z[1])
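Put together with csv.reader, the transposition might look like the sketch below. The in-memory sample is a hypothetical stand-in for a real file, and the code assumes comma-delimited data with at least one data row and rows of equal length.

```python
import csv
import io

# Hypothetical stand-in for an opened CSV file; assumes commas as delimiter.
sample = io.StringIO("var_a,var_b\na,1\nb,2\nc,3\n")

reader = csv.reader(sample)
header = next(reader)  # consume the header row: ['var_a', 'var_b']

# zip(*rows) transposes rows into columns; it stops at the shortest row,
# so ragged rows would be silently truncated.
columns = [list(col) for col in zip(*reader)]

listA = columns[0]
listB = [int(x) for x in columns[1]]  # csv.reader yields strings

print(listA)  # ['a', 'b', 'c']
print(listB)  # [1, 2, 3]
```

Note that csv.reader always returns strings, so the int conversion still has to be done explicitly.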

You can use csv.DictReader and build one list per header:

import csv

result = {}
with open(fn) as f:  # fn is the path to your tab-delimited file
    for line in csv.DictReader(f, delimiter='\t'):
        for k in line:
            # collect each value under its column header
            result.setdefault(k, []).append(line[k].strip())

print(result)

Prints:

{'var_a': ['a', 'b', 'c'], 'var_b': ['1', '2', '3']}
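The values come back as strings, so var_b is ['1', '2', '3'] rather than [1, 2, 3]. One possible way to get integers, sketched here with a hypothetical helper named convert_column (not part of the csv module), is to try the conversion per column and keep the strings when it fails:

```python
# The dictionary as printed above.
result = {'var_a': ['a', 'b', 'c'], 'var_b': ['1', '2', '3']}

def convert_column(values, converter):
    """Hypothetical helper: convert every value, or return the column unchanged."""
    try:
        return [converter(v) for v in values]
    except ValueError:
        # At least one value is not convertible; leave the column as strings.
        return values

typed = {name: convert_column(values, int) for name, values in result.items()}
print(typed)  # {'var_a': ['a', 'b', 'c'], 'var_b': [1, 2, 3]}
```

This treats a column as numeric only if every value converts cleanly, which may or may not be the behaviour you want for mixed data.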
