
How to split a string and update the dictionary in csv file in python?

So I have a csv file with stock data inside it in the format:

Date,"Open","High","Low"

2012-11-14,660.66,662.18,123.4

I have successfully converted all the relevant data to the correct variable types, i.e. all Open values are floats, all High values are floats, and Date is a string.

This is my code so far:

    import csv

    # (column name, conversion function) pairs
    types = [ ("Date", str), ("Open", float), ("High", float),
              ("Low", float), ("Close", float), ("Volume", int), ("Adj Close", float) ]

    with open("googlePrices.csv") as f:
        for row in csv.DictReader(f):  # read a row as {col1: val1, col2: val2..}
            row.update((key, conversion(row[key])) for key, conversion in types)

How do I strip every date value so that there are no '-' characters in it, and then convert the result to an integer? I tried to use datetime, but I can't really understand it.

Eliminating the - characters and converting the resulting strings to integers probably won't help you. You will want to use the datetime module, more specifically strptime:

classmethod datetime.strptime(date_string, format)

Return a datetime corresponding to date_string, parsed according to format. This is equivalent to datetime(*(time.strptime(date_string, format)[0:6])). ValueError is raised if the date_string and format can't be parsed by time.strptime() or if it returns a value which isn't a time tuple. For a complete list of formatting directives, see section strftime() and strptime() Behavior.

e.g.:

    import datetime

    datetime.datetime.strptime('2012-11-14', '%Y-%m-%d')
    # datetime.datetime(2012, 11, 14, 0, 0)
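To connect this back to the question's loop, here is a minimal sketch that plugs a strptime-based converter into the same types list. The helper name parse_date and the in-memory SAMPLE text (standing in for googlePrices.csv) are assumptions made for illustration, and the second data row is made up.

```python
import csv
import io
from datetime import datetime

# Stand-in for googlePrices.csv. The first data row is from the question;
# the second is made up for illustration.
SAMPLE = '''Date,"Open","High","Low"
2012-11-14,660.66,662.18,123.4
2012-11-15,650.0,660.0,647.68
'''

def parse_date(text):
    """Parse a 'YYYY-MM-DD' string into a datetime.date (hypothetical helper)."""
    return datetime.strptime(text, "%Y-%m-%d").date()

# Same (column, conversion) idea as in the question, with a real date parser.
types = [("Date", parse_date), ("Open", float), ("High", float), ("Low", float)]

def load_rows(f):
    """Read every CSV row and convert each listed column in place."""
    rows = []
    for row in csv.DictReader(f):
        row.update((key, conversion(row[key])) for key, conversion in types)
        rows.append(row)
    return rows

rows = load_rows(io.StringIO(SAMPLE))
print(rows[0]["Date"])                         # 2012-11-14
print((rows[1]["Date"] - rows[0]["Date"]).days)  # 1
```

With real date objects in the Date column, comparisons and differences between rows behave as calendar operations, which is the main reason to prefer them over stripped integers.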

Also, you seem to have a financial time series. There is no need to read the CSV and parse it manually: pandas can load the file and parse the Date column for you via read_csv and its parse_dates argument.
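As a rough illustration of the pandas route, the sketch below reads the question's sample row from an in-memory string (an assumption, standing in for googlePrices.csv) and asks read_csv to parse the Date column and use it as the index. It requires pandas to be installed.

```python
import io

import pandas as pd

# Stand-in for googlePrices.csv, using the row shown in the question.
SAMPLE = '''Date,"Open","High","Low"
2012-11-14,660.66,662.18,123.4
'''

# parse_dates converts the Date column to timestamps; index_col makes it the index.
df = pd.read_csv(io.StringIO(SAMPLE), parse_dates=["Date"], index_col="Date")

print(df.index[0])   # first date, as a pandas Timestamp
print(df.dtypes)     # the numeric columns are inferred automatically
```

For the real file, passing the path "googlePrices.csv" in place of the StringIO object should work the same way, provided the column is actually named Date.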

Since the data is stored in a CSV file, every value is just a string once it has been read. If the format of Date is fixed, you can simply remove the - characters.

    import csv

    types = [ ("Date", int), ("Open", float), ("High", float),
              ("Low", float), ("Close", float), ("Volume", int), ("Adj Close", float) ]

    rowlist = []

    with open("googlePrices.csv") as f:
        for row in csv.DictReader(f):
            row['Date'] = row['Date'].replace('-', '')  # '2012-11-14' -> '20121114'
            # convert only the columns that are actually present in the file
            row.update((key, conversion(row[key]))
                       for key, conversion in types if key in row)
            rowlist.append(row)

Output (Python 3.8+):

    >>> print(rowlist)
    [{'Date': 20121114, 'Open': 660.66, 'High': 662.18, 'Low': 123.4}]
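One caveat with the integer form is worth seeing in numbers. Integers such as 20121114 still sort chronologically, but arithmetic on them is not date arithmetic. The sketch below uses a hypothetical helper, int_to_date, to convert such an integer back into a date and compares the two kinds of subtraction.

```python
from datetime import datetime

def int_to_date(n):
    """Turn an integer like 20121114 back into a datetime.date (hypothetical helper)."""
    return datetime.strptime(str(n), "%Y%m%d").date()

last_of_november = 20121130
first_of_december = 20121201

# Ordering survives the integer encoding...
print(last_of_november < first_of_december)   # True

# ...but plain subtraction does not give a number of days.
print(first_of_december - last_of_november)   # 71

# Converting back to dates gives the real gap.
gap = int_to_date(first_of_december) - int_to_date(last_of_november)
print(gap.days)                               # 1
```

So the integer form is fine for sorting or as a compact key, while anything involving intervals is better done on date objects.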

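If you want to convert the date to a Unix timestamp, use this. Note that time.mktime interprets the parsed time as local time, so the result depends on the machine's time zone (the value below is what a machine set to UTC+8 returns):

    >>> import time
    >>> time.mktime(time.strptime('2012-11-14', '%Y-%m-%d'))
    1352822400.0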
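Because time.mktime works in local time, the same call returns different numbers on machines in different time zones. If a time-zone-independent value is preferred, one option is to treat the date as midnight UTC. The sketch below shows two standard-library ways to do that; treating the dates as UTC is an assumption, since the question does not say which time zone they are in.

```python
import calendar
import time
from datetime import datetime, timezone

date_string = "2012-11-14"

# calendar.timegm is the UTC counterpart of time.mktime.
utc_seconds = calendar.timegm(time.strptime(date_string, "%Y-%m-%d"))

# The same result via a timezone-aware datetime.
aware = datetime.strptime(date_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)

print(utc_seconds)         # 1352851200
print(aware.timestamp())   # 1352851200.0
```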
