Reformat date to YYYY/MM/DD and replace column data in CSV

I'm new to Python and have a simple CSV that looks like this:

FROM_ID,JOIN_DATE,FAV_SPORT
100004,06/08/2016,Football
100006,06/08/2016,Tennis
100007,06/08/2016,Football
100009,06/08/2016,Basketball

I am trying to rewrite the date as YYYY/MM/DD. This is what I have so far:

import csv
f = open('reg.csv')
csv_f = csv.reader(f)

for row in csv_f:
    parts = row[1].split('/')
print parts[2]

All this does is print the year (YYYY), which is one step closer :) Can anyone advise how to reassemble the parts into the YYYY/MM/DD format?

Also, I notice Python doesn't have case/select. How would I do a find/replace on "FROM_ID" and replace certain values with another number? Something like:

if FROM_ID is equal to X then Y

Thanks in advance for any help. I have scoured the internet for hours and am a bit stuck, but I hope to get moving again. Thanks!

You're pretty close. All you need is:

# assumes the input is MM/DD/YYYY
print(parts[2] + "/" + parts[0] + "/" + parts[1])

import csv
f = open('reg.csv')
csv_f = csv.reader(f)

for row in csv_f:
    # skip the header row
    if "JOIN_DATE" in row: continue
    parts = row[1].split('/')
    # assumes the input is DD/MM/YYYY; reorder to YYYY/MM/DD
    data = "{}/{}/{}".format(parts[2],parts[1],parts[0])
    print(data)
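The loops above only print the new date. To replace the column in the file itself, one option is to write every row back out with csv.writer. The following is a minimal sketch, not the original poster's code: it assumes Python 3, day-first dates (DD/MM/YYYY) and a header in the first row, and the function name reformat_join_dates and the in-memory sample are made up for illustration.

```python
import csv
import io


def reformat_join_dates(src, dst, date_col=1):
    """Copy CSV rows from src to dst, rewriting DD/MM/YYYY as YYYY/MM/DD."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(next(reader))  # copy the header row unchanged
    for row in reader:
        day, month, year = row[date_col].split("/")
        row[date_col] = "{}/{}/{}".format(year, month, day)
        writer.writerow(row)


# In-memory sample standing in for reg.csv (assumption: dates are day-first).
sample = io.StringIO(
    "FROM_ID,JOIN_DATE,FAV_SPORT\n"
    "100004,06/08/2016,Football\n"
    "100006,06/08/2016,Tennis\n"
)
result = io.StringIO()
reformat_join_dates(sample, result)
print(result.getvalue())
```

With real files, the same function can be given open('reg.csv', newline='') as src and a second file opened for writing as dst. Writing to a separate output file rather than overwriting the input keeps the original data intact if something goes wrong halfway.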

As to your second point, you should probably have a look at pandas, a Python library for data analysis, especially of tabular data sets.

You could read in your data using

import pandas as pd

df = pd.read_csv("path_to_your_file")

which returns a DataFrame on which you can do operations such as selecting subsets. Your example would become

df[df.FROM_ID == X] 
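The selection above only picks out the matching rows. To also carry out the "then Y" part, the same boolean mask can be combined with .loc to assign a new value. This is a rough sketch on an in-memory sample; the IDs used for X and Y are hypothetical, and it assumes FROM_ID was read as integers, which is what read_csv does for a purely numeric column.

```python
import io

import pandas as pd

# In-memory sample standing in for the real file.
csv_text = (
    "FROM_ID,JOIN_DATE,FAV_SPORT\n"
    "100004,06/08/2016,Football\n"
    "100006,06/08/2016,Tennis\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# read_csv parses FROM_ID as integers here, so compare against an int.
old_id, new_id = 100004, 900004  # hypothetical values for X and Y

# The boolean mask selects the matching rows; .loc assigns to that column only.
df.loc[df["FROM_ID"] == old_id, "FROM_ID"] = new_id

print(df["FROM_ID"].tolist())
```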

Try this one:

import datetime
.
.
datetime.datetime.strptime(row[1],'%d/%m/%Y').strftime('%Y/%m/%d')
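One thing worth knowing about the strptime approach is that it validates the date as well as reordering it, and the input format has to be stated explicitly. The sample value 06/08/2016 is ambiguous, so the sketch below shows both readings; the helper name to_iso_like is made up for illustration.

```python
from datetime import datetime


def to_iso_like(date_text, in_fmt="%d/%m/%Y", out_fmt="%Y/%m/%d"):
    """Parse date_text with in_fmt and re-emit it with out_fmt."""
    return datetime.strptime(date_text, in_fmt).strftime(out_fmt)


print(to_iso_like("06/08/2016"))                      # read as day-first
print(to_iso_like("06/08/2016", in_fmt="%m/%d/%Y"))   # read as month-first

# Impossible dates raise ValueError instead of being silently reordered.
try:
    to_iso_like("31/02/2016")
except ValueError as exc:
    print("rejected:", exc)
```

Plain string splitting would happily turn 31/02/2016 into 2016/02/31, so the datetime route is the safer choice when the data may contain bad values.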

I recommend using pandas:

import pandas as pd

df = pd.read_csv("path_to_your_file")

def change_date_format(date):

    # create a list of the substrings separated by '/',
    # so in your case ['DD', 'MM', 'YYYY'] 
    split_dt = date.split("/")

    return split_dt[2] + '/' + split_dt[1] + '/' + split_dt[0]   

#apply this function to all elements of 'JOIN_DATE' columns
df.loc[:, 'JOIN_DATE'] = df.loc[:, 'JOIN_DATE'].apply(change_date_format)
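As an alternative to splitting strings by hand, pandas can parse the column as dates and format it back to text. The sketch below is one possible variant, not the answerer's method; it assumes day-first dates and uses an in-memory sample instead of a real file.

```python
import io

import pandas as pd

# In-memory sample standing in for the real file (assumption: day-first dates).
csv_text = (
    "FROM_ID,JOIN_DATE,FAV_SPORT\n"
    "100004,06/08/2016,Football\n"
    "100006,06/08/2016,Tennis\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Parse with an explicit format, then format back to text as YYYY/MM/DD.
parsed = pd.to_datetime(df["JOIN_DATE"], format="%d/%m/%Y")
df["JOIN_DATE"] = parsed.dt.strftime("%Y/%m/%d")

# index=False keeps the row index out of the CSV output.
print(df.to_csv(index=False))
```

Passing a file path as the first argument of to_csv writes the result to disk instead of returning it as a string.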

On your second question, you can do:

to_replace = ['X', 'Z']
replace_values = ['Y', 'W']
replace_dict = dict(zip(to_replace, replace_values))

df['FROM_ID'] = df['FROM_ID'].replace(replace_dict)
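If pandas is not an option, the same mapping idea works with a plain dict, which is also a common Python stand-in for a case/select statement. A minimal sketch follows; the replacement IDs are made up, and the keys are strings because csv.reader yields every field as text.

```python
# Hypothetical mapping of old FROM_ID values to new ones.
replace_dict = {"100004": "900004", "100007": "900007"}

# Rows as csv.reader would yield them (header already skipped).
rows = [
    ["100004", "06/08/2016", "Football"],
    ["100006", "06/08/2016", "Tennis"],
    ["100007", "06/08/2016", "Football"],
]

for row in rows:
    # dict.get returns its second argument when the key is missing,
    # so IDs without a mapping are left unchanged.
    row[0] = replace_dict.get(row[0], row[0])

print([row[0] for row in rows])
```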

To answer the question in the comment: assume you create a CSV with two columns, "To_Replace" and "Replace_Value":

     To_Replace    Replace_Value
0          X             Y
1          W             Z
2          A             B

You can then create the replace_dict used in the script above like this:

import pandas as pd

replace_file = pd.read_csv(r'C:\Users\flabriol\Desktop\example_so.csv')
replace_dict = dict(zip(replace_file['To_Replace'], replace_file['Replace_Value']))
