
Reading a text file with dates into a list of dictionaries in Python

I would like to read the following text file:

date        candy
1/12/2011   300
1/20/2010   200
1/16/2010   200

into a list of dictionaries as follows:

candysales= [ {'date': d(2011,1,12), 'sales': 300}, {'date': d(2010,1,20), 'sales': 200},{'date': d(2010,1,16), 'sales': 200}]

Does anyone have any ideas of how to begin doing this, or any resources that I can look at?

You can use csv.DictReader, which reads a CSV file, uses the first row as the dictionary key names, and parses each following row into a dictionary (on older Python versions you may lose field order, since plain dictionaries were not reliably ordered before Python 3.7). You can then convert the date from a string to a datetime.date object using datetime.datetime's strptime method and calling .date() on the result:

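import csv
from datetime import datetime

candysales = []
with open('/path/to/sales.csv', newline='') as f:
    for row in csv.DictReader(f):
        # The sample dates are month/day/year, e.g. 1/20/2010
        row['date'] = datetime.strptime(row['date'], '%m/%d/%Y').date()
        candysales.append(row)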
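
The format string passed to strptime matters here. As a quick check against the dates in the question (which the expected output shows are month/day/year), this small sketch compares a month-first and a day-first format. The variable names are only for illustration:

```python
from datetime import date, datetime

# Month-first format: matches the output the question asks for.
parsed = datetime.strptime('1/12/2011', '%m/%d/%Y').date()

# Day-first format: parses without error but gives a different date.
swapped = datetime.strptime('1/12/2011', '%d/%m/%Y').date()

# Day-first format fails outright on 1/20/2010, since there is no month 20.
try:
    datetime.strptime('1/20/2010', '%d/%m/%Y')
    day_first_ok = True
except ValueError:
    day_first_ok = False
```

So with this data a day-first format would either silently swap day and month or raise ValueError, depending on the row.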

Edit: I've just noticed that the input isn't CSV (it looks like a fixed-width format). The csv module works with CSV or tab-delimited files, but probably won't work well with this fixed-width format. If you can control the format of this file, CSV would be a good choice; if not, you can convert it on the fly using the re module:

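import csv
import re
from datetime import datetime

def csvify(iterable):
    # Replace each run of whitespace with a single comma
    for line in iterable:
        yield re.sub(r'\s+', ',', line.rstrip())

candysales = []
with open('/path/to/sales.csv') as f:
    for row in csv.DictReader(csvify(f)):
        # The sample dates are month/day/year, e.g. 1/20/2010
        row['date'] = datetime.strptime(row['date'], '%m/%d/%Y').date()
        candysales.append(row)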

The csvify function is a generator: it yields each line of the underlying file with every run of one or more whitespace characters replaced by a single comma, thus converting the data to CSV. That generator is passed to csv.DictReader in place of the file itself.

This probably won't serve as a general-purpose solution to converting fixed-width text formats to CSV, but it will work if the example you've given above is representative.
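
To try this without a file on disk, here is a minimal self-contained sketch. It makes a few assumptions that are not in the answer above: the sample data is held in an io.StringIO object, the dates are month/day/year, and the candy column should end up under a 'sales' key as an integer, as in the question's expected output. The helper name read_candysales is made up for illustration.

```python
import csv
import io
import re
from datetime import datetime

# Sample data copied from the question.
SAMPLE = """date        candy
1/12/2011   300
1/20/2010   200
1/16/2010   200
"""

def csvify(lines):
    # Collapse each run of whitespace into a single comma.
    for line in lines:
        yield re.sub(r'\s+', ',', line.strip())

def read_candysales(lines):
    # Hypothetical helper: build the list of dicts asked for in the question.
    result = []
    for row in csv.DictReader(csvify(lines)):
        result.append({
            'date': datetime.strptime(row['date'], '%m/%d/%Y').date(),
            'sales': int(row['candy']),
        })
    return result

candysales = read_candysales(io.StringIO(SAMPLE))
```

Passing an open file object instead of the StringIO should work the same way, since both are iterated line by line.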

You can read the entire file into a string (fin is an already opened file object)

data = fin.read()

Split it into lines

data=data.splitlines()

Use a list comprehension, skipping the header line (this needs import datetime)

[dict((('date', datetime.datetime.strptime(k, "%m/%d/%Y")), ('sales', v)))
   for (k, v) in [e.split() for e in data[1:]]]

which will give you a result like

[{'date': datetime.datetime(2011, 1, 12, 0, 0), 'sales': '300'}, {'date': datetime.datetime(2010, 1, 20, 0, 0), 'sales': '200'}, {'date': datetime.datetime(2010, 1, 16, 0, 0), 'sales': '200'}]

If reading the entire file into memory is an issue for you, you can process it line by line instead

candysales = []
fin.readline()  # skip the header line
for d in fin:
    k, v = d.split()
    candysales.append({'date': datetime.datetime.strptime(k, "%m/%d/%Y"), 'sales': v})
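
The line-by-line approach can also produce the exact shape asked for in the question (date objects and integer sales). Here is a sketch under some assumptions: parse_sales is a hypothetical helper name, blank lines are skipped, and an io.StringIO object stands in for the open file fin.

```python
import io
from datetime import datetime

def parse_sales(lines):
    # Hypothetical helper: lazily turn "date amount" lines into dicts.
    lines = iter(lines)
    next(lines, None)  # skip the header line
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        k, v = line.split()
        yield {
            'date': datetime.strptime(k, '%m/%d/%Y').date(),  # date, not datetime
            'sales': int(v),                                  # int, not str
        }

# Stand-in for an open file containing the data from the question.
fin = io.StringIO(
    "date        candy\n"
    "1/12/2011   300\n"
    "1/20/2010   200\n"
    "1/16/2010   200\n"
)
candysales = list(parse_sales(fin))
```

Because parse_sales is a generator, you can also loop over it directly instead of building the full list when the file is large.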
