
How to split a string and update the dictionary in csv file in python?

So I have a csv file with stock data inside it in the format:

Date,"Open","High","Low"

2012-11-14,660.66,662.18,123.4

I have successfully converted all the relevant data to the correct variable type, i.e. all Open values are floats, High values are floats, and Date is a string.

This is my code so far:

    import csv

    types = [ ("Date", str), ("Open", float), ("High", float),
      ("Low", float), ("Close", float), ("Volume", int), ("Adj Close", float) ]

    with open("googlePrices.csv") as f:
        for row in csv.DictReader(f):  # read a row as {col1: val1, col2: val2..}
            row.update((key, conversion(row[key])) for key, conversion in types)

How do I strip every date value so that there are no '-' characters in it, and then convert the result to an integer? I tried to use datetime but I can't really understand it.

Eliminating the - characters and converting the resulting strings to integers probably won't help you. You will want to use datetime, more specifically strptime:

classmethod datetime.strptime(date_string, format)

Return a datetime corresponding to date_string, parsed according to format. This is equivalent to datetime(*(time.strptime(date_string, format)[0:6])). ValueError is raised if the date_string and format can't be parsed by time.strptime() or if it returns a value which isn't a time tuple. For a complete list of formatting directives, see section strftime() and strptime() Behavior.

e.g.:

    import datetime

    datetime.datetime.strptime('2012-11-14', '%Y-%m-%d')
    # datetime.datetime(2012, 11, 14, 0, 0)
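To show how strptime could slot into the loop from the question, here is a minimal sketch. It assumes the four-column layout shown in the question and uses an in-memory sample instead of googlePrices.csv; the names SAMPLE, parse_date and load_rows are hypothetical helpers introduced only for illustration.

```python
import csv
import io
from datetime import datetime

# Stand-in for googlePrices.csv (assumption: same layout as in the question).
SAMPLE = 'Date,"Open","High","Low"\n2012-11-14,660.66,662.18,123.4\n'


def parse_date(text):
    """Parse a YYYY-MM-DD string into a datetime.date."""
    return datetime.strptime(text, "%Y-%m-%d").date()


# Same (column, converter) pairs as in the question, but Date goes through strptime.
types = [("Date", parse_date), ("Open", float), ("High", float), ("Low", float)]


def load_rows(f):
    """Read every CSV row as a dict and convert each listed column in place."""
    rows = []
    for row in csv.DictReader(f):
        row.update((key, conversion(row[key])) for key, conversion in types)
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE))
first = rows[0]["Date"]
print(first)                          # 2012-11-14
print(int(first.strftime("%Y%m%d")))  # 20121114, if an integer is really required
```

Keeping the value as a date object means comparisons and date arithmetic work directly, and an integer form can still be derived with strftime when something downstream needs it.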

Also, you seem to have a financial time series. There is no need to read the CSV and parse it manually. Pandas does exactly what you need very well.
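As a rough illustration of the pandas route, the sketch below reads the same layout with read_csv and lets it parse the Date column. It assumes pandas is installed and uses an in-memory sample (the hypothetical name SAMPLE) in place of the real file.

```python
import io

import pandas as pd

# Stand-in for googlePrices.csv (assumption: same layout as in the question).
SAMPLE = 'Date,"Open","High","Low"\n2012-11-14,660.66,662.18,123.4\n'

# parse_dates converts the Date column to timestamps; index_col makes it the index.
df = pd.read_csv(io.StringIO(SAMPLE), parse_dates=["Date"], index_col="Date")

print(df.dtypes)    # the price columns are inferred as float64
print(df.index[0])  # the first row's date as a pandas Timestamp
```

With a real file, the path would replace the io.StringIO object, e.g. pd.read_csv("googlePrices.csv", parse_dates=["Date"], index_col="Date").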

Since the data are saved in a csv file, they are just strings after being read. If the format of Date is fixed, then simply remove the - characters.

    import csv

    # Date becomes an int once the '-' characters are removed, e.g. 20121114
    types = [ ("Date", int), ("Open", float), ("High", float),
      ("Low", float), ("Close", float), ("Volume", int), ("Adj Close", float) ]

    rowlist = []

    with open("googlePrices.csv") as f:
        for row in csv.DictReader(f):
            row['Date'] = row['Date'].replace('-', '')
            # convert only the columns that are actually present in this file
            row.update((key, conversion(row[key]))
                       for key, conversion in types if key in row)
            rowlist.append(row)

Output:

    >>> print(rowlist)
    [{'Date': 20121114, 'High': 662.18, 'Open': 660.66, 'Low': 123.4}]

If you want to convert Date to a timestamp, use this (mktime interprets the value as local time, so the result depends on your timezone):

    >>> import time
    >>> time.mktime(time.strptime('2012-11-14', '%Y-%m-%d'))
    1352822400.0
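Because time.mktime treats the parsed value as local time, the number it returns varies with the machine's timezone. If a timezone-independent value is wanted, one option is to treat the date as midnight UTC. The sketch below assumes that UTC is the intended reference, which the original posts do not state.

```python
import calendar
import time
from datetime import datetime, timezone

parsed = datetime.strptime("2012-11-14", "%Y-%m-%d")

# Attach UTC explicitly so the result does not depend on the local timezone.
utc_seconds = parsed.replace(tzinfo=timezone.utc).timestamp()
print(utc_seconds)  # 1352851200.0

# Same idea with the time/calendar modules: timegm is the UTC counterpart of mktime.
utc_seconds_int = calendar.timegm(time.strptime("2012-11-14", "%Y-%m-%d"))
print(utc_seconds_int)  # 1352851200
```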
