Is there a better way to read files?

Question

Every time I read a CSV file into lists, I use this long method. Can it be simplified?

Creating empty lists
Reading the file row-wise and appending each field to its list

import csv
import sys

filename = 'mtms_excelExtraction_m_Model_Definition.csv'
Ana_Type = []
Ana_Length = []
Ana_Text = []
Ana_Space = []
with open(filename, 'rt') as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            Ana_Type.append(row[0])
            Ana_Length.append(row[1])
            Ana_Text.append(row[2])
            Ana_Space.append(row[3])
    except csv.Error as e:
        sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))

Answer 1

This is a good opportunity for you to start using pandas and working with DataFrames.

import pandas as pd

df = pd.read_csv(path_to_csv)

1-2 (depending on if you count the import) lines of code and you're done!
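If you still want the four separate lists from the question, a minimal sketch could look like the one below. It assumes the file has no header row and exactly four columns; the in-memory sample is hypothetical and only stands in for the real file, and `dtype=str` keeps every value as a string, as the csv module would.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file (no header row assumed).
sample = io.StringIO("A,10,hello,1\nB,20,world,2\n")

# Column names taken from the variable names in the question.
names = ["Ana_Type", "Ana_Length", "Ana_Text", "Ana_Space"]
df = pd.read_csv(sample, header=None, names=names, dtype=str)

# Each DataFrame column converts to a plain Python list.
Ana_Type = df["Ana_Type"].tolist()
Ana_Length = df["Ana_Length"].tolist()
Ana_Text = df["Ana_Text"].tolist()
Ana_Space = df["Ana_Space"].tolist()
```

If the later processing can work on the DataFrame directly, the conversion to lists is probably unnecessary.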

Answer 2

This one is essentially the numpy way of processing the CSV file, without using numpy. Whether it is better than your original method is largely a matter of taste. Like the numpy and pandas approaches, it loads the whole file into memory and then transposes it into lists:

with open(filename, 'rt') as f:  
    reader = csv.reader(f)   
    tmp = list(reader)
Ana_Type, Ana_Length, Ana_Text, Ana_Space = [[tmp[i][j] for i in range(len(tmp))]
                                             for j in range(len(tmp[0]))]

It uses less code and builds the lists with comprehensions instead of repeated appends, but it needs more memory (as would numpy or pandas).

Depending on how you process the data later, numpy or pandas could be a nice option, but IMHO using them only to load a CSV file into lists is not worth it.

Answer 3

You can use a csv.DictReader:

import csv

with open(filename, 'rt') as f:  
    data = list(csv.DictReader(f, fieldnames=["Type", "Length", "Text", "Space"]))

print(data)

This will give you a single list of dict objects, one per row.
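As a rough sketch of how that structure can be used afterwards, the example below reads a small in-memory sample (hypothetical data, standing in for the real file) and then pulls out a single field and a whole column. The variable names `first_row_text` and `all_types` are only for illustration.

```python
import csv
import io

# Hypothetical sample standing in for the real CSV file (no header row).
sample = io.StringIO("A,10,hello,1\nB,20,world,2\n")

data = list(csv.DictReader(sample, fieldnames=["Type", "Length", "Text", "Space"]))

# Access one field of one row by name instead of by index.
first_row_text = data[0]["Text"]

# Rebuild a single column as a list when it is needed.
all_types = [row["Type"] for row in data]
```

This keeps each row together, which tends to be easier to work with than four parallel lists when the fields of a row are used together.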

Answer 4

This could be useful:

import numpy as np
# read the rows with Numpy
rows = np.genfromtxt('data.csv',dtype='str',delimiter=';')
# call numpy.transpose to convert the rows to columns
cols = np.transpose(rows)

# get the stuff as lists
Ana_Type = list(cols[0])
Ana_Length = list(cols[1])
Ana_Text = list(cols[2])
Ana_Space = list(cols[3])

Edit: note that if the file has a header row, the first element of each list will be the column name (example with test data):

['Date', '2020-03-03', '2020-03-04', '2020-03-05', '2020-03-06']
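If the column names are not wanted in the lists, one option is the `skip_header` parameter of `genfromtxt`. The sketch below assumes a semicolon-delimited file with exactly one header line and at least two data rows; the in-memory sample is hypothetical test data.

```python
import io

import numpy as np

# Hypothetical sample standing in for the real file (one header line).
sample = io.StringIO("Type;Length;Text;Space\nA;10;hello;1\nB;20;world;2\n")

# skip_header=1 drops the first line before parsing.
rows = np.genfromtxt(sample, dtype=str, delimiter=";", skip_header=1)

# Transpose rows to columns and convert to plain Python lists of str.
Ana_Type, Ana_Length, Ana_Text, Ana_Space = rows.T.tolist()
```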

Answer 5

Try this:

import csv
from collections import defaultdict
d = defaultdict(list)
with open(filename, mode='r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        for k,v in row.items():
            d[k].append(v)

Then inspect the keys (taken from the header row of the file):

d.keys()
dict_keys(['Ana_Type', 'Ana_Length', 'Ana_Text', 'Ana_Space'])

and look up a single column:

d.get('Ana_Type')
['bla','bla1','df','ccc']

Answer 6

The repetitive calls to list.append can be avoided by reading the csv and using the zip builtin function to transpose the rows.

import io, csv

# Create an example file
buf = io.StringIO('type1,length1,text1,space1\ntype2,length2,text2,space2\ntype3,length3,text3,space3')

reader = csv.reader(buf)
# Uncomment the next line if there is a header row
# next(reader)

Ana_Types, Ana_Length, Ana_Text, Ana_Space = zip(*reader)

print(Ana_Types)
('type1', 'type2', 'type3')
print(Ana_Length)
('length1', 'length2', 'length3')
...

If you need lists rather than tuples, you can use a list comprehension to convert them. Use this in place of the unpacking line above, since the reader can only be consumed once:

Ana_Types, Ana_Length, Ana_Text, Ana_Space = [list(x) for x in zip(*reader)]
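Put together, the approach could be wrapped in a small helper. This is only a sketch: `read_columns` and its `has_header` flag are hypothetical names, and the sample data is made up.

```python
import csv
import io


def read_columns(f, has_header=False):
    """Return the columns of an open CSV file object as a list of lists."""
    reader = csv.reader(f)
    if has_header:
        next(reader, None)  # discard the header row, if any
    # zip(*reader) transposes rows into columns.
    return [list(col) for col in zip(*reader)]


# Hypothetical sample with a header row.
sample = io.StringIO(
    "type,length,text,space\n"
    "type1,length1,text1,space1\n"
    "type2,length2,text2,space2\n"
)

Ana_Types, Ana_Length, Ana_Text, Ana_Space = read_columns(sample, has_header=True)
```

Note that zip stops at the shortest row, so a row with missing fields silently truncates the result; itertools.zip_longest is an alternative if rows may be ragged.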

6 answers

solution1
2 ACCEPTED 2020-07-23 14:37:02

solution2
2 2020-07-23 14:52:05

solution3
1 2020-07-23 14:32:56

solution4
1 2020-07-23 14:35:41

solution5
1 2020-07-23 15:28:18

solution6
1 2020-07-23 15:29:12
