
Python: need to create a new column when merging multiple csv files

Thanks in advance for your help. This is a multi-part question.

I have zip files that contain pricing info for multiple stocks. In the current format, the header row is:

ticker,date,open,high,low,close,vol

and the first data row looks like this:

AAPL,201906030900,176.32,176.32,176.24,176.29,2247

Desired format:

Header:

ticker,date,time,open,high,low,close,vol

and data:

AAPL,20190603,09:00,176.32,176.32,176.24,176.29,2247

A time column is added and filled with the last 4 digits of the date value, with a colon in the middle, and those last 4 digits are removed from the date column.

There are about 400 rows of data for each stock in each file, so every row needs to be converted.

I haven't been able to find an answer, here or elsewhere on the web, that I could understand well enough to accomplish this.

Try the following, using pandas:

data.csv

ticker,date,open,high,low,close,vol
AAPL,201906030900,176.32,176.32,176.24,176.29,2247
ABCD,202002211000,220.97,217.38,221.43,219.82,8544

Code

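import pandas as pd

# Read 'date' as text so the 12-digit YYYYMMDDHHMM stamp can be sliced directly
df = pd.read_csv('data.csv', dtype={'date': str})

# The last 4 digits are HHMM: format them as HH:MM in a new 'time' column
df['time'] = df['date'].str[-4:-2] + ':' + df['date'].str[-2:]
# Keep only the leading YYYYMMDD part in 'date'
df['date'] = df['date'].str[:-4]

# Move 'time' from the last position to right after 'date'
cols = df.columns.to_list()
cols = cols[:2] + cols[-1:] + cols[2:-1]
df = df[cols]

df.to_csv('out.csv', index=False)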

out.csv

ticker,date,time,open,high,low,close,vol
AAPL,20190603,09:00,176.32,176.32,176.24,176.29,2247
ABCD,20200221,10:00,220.97,217.38,221.43,219.82,8544

You can use the same code to loop over multiple files.
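As a rough sketch of that loop, assuming each zip archive holds one or more .csv members that all share the header shown in the question: the helper names `split_date_time` and `convert_zip` are made up for illustration, and the archive layout is a guess, since the question does not describe it.

```python
import zipfile

import pandas as pd


def split_date_time(df: pd.DataFrame) -> pd.DataFrame:
    """Split a YYYYMMDDHHMM 'date' column into 'date' and a new 'time' column."""
    stamp = df["date"].astype(str)
    out = df.copy()
    out["date"] = stamp.str[:-4]
    # Insert 'time' (HH:MM) immediately after the 'date' column
    out.insert(
        out.columns.get_loc("date") + 1,
        "time",
        stamp.str[-4:-2] + ":" + stamp.str[-2:],
    )
    return out


def convert_zip(zip_path: str) -> pd.DataFrame:
    """Convert every .csv member of one zip archive and merge the rows."""
    frames = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # assumption: only .csv members hold price data
            with archive.open(name) as member:
                raw = pd.read_csv(member, dtype={"date": str})
            frames.append(split_date_time(raw))
    # pd.concat raises ValueError if the archive had no .csv members
    return pd.concat(frames, ignore_index=True)
```

Calling `convert_zip('prices.zip').to_csv('out.csv', index=False)` (the file name is a placeholder) would then write one merged file per archive. To process several archives, call `convert_zip` once per path and concatenate the results the same way.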
