
Parsing Multiple CSV (.txt) Files in Python

I am trying to set up a system where a user can log into the web interface and track orders that have been placed. The system will track the orders from their initial confirmation, through production, and finally stop before shipping. (As my wife explained it: "Like the Domino's Pizza order tracker, but for business cards.") I am stuck at a point where I need to parse data from an ever-changing directory of comma-delimited .txt files. Each order that is placed automatically generates its own .txt file with all sorts of important information that I will display on the web interface. For example:

H39TZ3.txt:

token,tag,prodcode,qty          #(These are the headers)
,H39TZ3,pchd_4stpff,,100        #(These are the corresponding values for part 1 of the order)
,H39TZ3,pchdn_8ststts,6420-PCNM8ST,100   #(These are values for part 2 of the order)

There are going to be upwards of 300 different .txt files in the directory at any given time, and the files will come and go based on their order status (once shipped, the files will be archived). I have read up on code to parse an individual file and import the values into a dictionary, but everything I've found is for a single file. How would I go about writing something like this, only for multiple files?

import csv

d = {}

# Parse a single order file into a dict keyed by the order tag
for row in csv.reader(open('H39TZ3.txt')):
    d['Order %s' % row[1]] = {'tag': row[1], 'prodcode': row[2], 'qty': row[3]}

Thanks!

You can use os.listdir() to list the contents of the directory containing your .txt files. Something like the following should work for you:

import csv
import os

for filename in os.listdir("."):  # "." is whatever directory holds the order files
    with open(filename) as csv_file:
        for row in csv.reader(csv_file):
            d['Order %s' % row[1]] = {'tag': row[1], 'prodcode': row[2], 'qty': row[3]}

Note that I added a with statement in there. It will make sure to close the file after you finish processing it so you don't waste/run out of file descriptors. If the directory might contain other files besides those you are interested in, you could add appropriate filtering before the with statement.
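For example, here is a minimal sketch of that filtering, assuming the order files all end in .txt and sit in a directory called orders (both the directory name and the extension check are assumptions on my part):

import csv
import os

d = {}
order_dir = "orders"  # hypothetical directory holding the per-order .txt files

for filename in os.listdir(order_dir):
    if not filename.endswith(".txt"):  # skip anything that isn't an order file
        continue
    with open(os.path.join(order_dir, filename)) as csv_file:
        for row in csv.reader(csv_file):
            d['Order %s' % row[1]] = {'tag': row[1], 'prodcode': row[2], 'qty': row[3]}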

I'd like to add that csv.DictReader is probably a better option if you want to read the rows as dictionaries. It will automatically set the keys of the dictionary based on the first row (headers). You'd use it like this:

with open(filename) as csv_file:
    for row in csv.DictReader(csv_file):
        d['Order ' + row['tag']] = row

As dm03514 mentions, though, a database will probably be a better option. sqlite comes with Python (the sqlite3 module), and you can use a variety of tools to inspect and modify the database. It should also be more robust than using individual files.
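If you go that route, here is a rough sketch of loading the order rows into sqlite with DictReader (the orders.db filename, the orders table and its columns, and the orders directory are all assumptions for illustration):

import csv
import os
import sqlite3

conn = sqlite3.connect("orders.db")  # hypothetical database file
conn.execute("CREATE TABLE IF NOT EXISTS orders (tag TEXT, prodcode TEXT, qty TEXT)")

order_dir = "orders"  # hypothetical directory of per-order .txt files
for filename in os.listdir(order_dir):
    if not filename.endswith(".txt"):
        continue
    with open(os.path.join(order_dir, filename)) as csv_file:
        for row in csv.DictReader(csv_file):
            conn.execute(
                "INSERT INTO orders (tag, prodcode, qty) VALUES (?, ?, ?)",
                (row["tag"], row["prodcode"], row["qty"]),
            )

conn.commit()
conn.close()

That way the web interface can query order data with ordinary SQL instead of re-reading every file on each request.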
