Python: need to create a new column when merging multiple csv files
Thanks in advance for any help. This is a multi-part question.
I have zip files that contain pricing info for multiple stocks. In the current format, the header row is:
ticker,date,open,high,low,close,vol
and an example of the first data row is:
AAPL,201906030900,176.32,176.32,176.24,176.29,2247
Desired format:
Header:
ticker,date,time,open,high,low,close,vol
and data:
AAPL,20190603,09:00,176.32,176.32,176.24,176.29,2247
Here a time column is added and filled with the last 4 digits of the date value, with a colon in the middle, and those last 4 digits are removed from the date column.
There are about 400 rows of data for each stock in each file, so every row needs to be converted.
I haven't been able to find an answer here or elsewhere on the web that I could understand well enough to accomplish this.
Try the following, using pandas:
data.csv
ticker,date,open,high,low,close,vol
AAPL,201906030900,176.32,176.32,176.24,176.29,2247
ABCD,202002211000,220.97,217.38,221.43,219.82,8544
code
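import pandas as pd

# Read the date column as text so it can be sliced directly
df = pd.read_csv('data.csv', dtype={'date': str})
# The last 4 characters are HHMM; format them as HH:MM
df['time'] = df['date'].str[-4:-2] + ':' + df['date'].str[-2:]
# Keep only the leading YYYYMMDD part in the date column
df['date'] = df['date'].str[:-4]
# Move the new 'time' column from the end to right after 'date'
cols = df.columns.to_list()
cols = cols[:2] + cols[-1:] + cols[2:-1]
df = df[cols]
df.to_csv('out.csv', index=False)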
out.csv
ticker,date,time,open,high,low,close,vol
AAPL,20190603,09:00,176.32,176.32,176.24,176.29,2247
ABCD,20200221,10:00,220.97,217.38,221.43,219.82,8544
You can use the same code to loop over multiple files.
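Since the question mentions zip files, here is a rough sketch of how that loop might look. It assumes each zip holds one or more .csv members with the header shown above and a 12-digit date value. The function names split_date_time, convert_zip and merge_zips, and the folder and output paths, are made up for illustration.

```python
import zipfile
from pathlib import Path

import pandas as pd


def split_date_time(df):
    """Split a YYYYMMDDHHMM 'date' column into 'date' and 'time' columns."""
    stamp = df['date'].astype(str)
    out = df.copy()
    out['date'] = stamp.str[:-4]
    # Insert 'time' directly after 'date' so the order matches the desired header
    out.insert(out.columns.get_loc('date') + 1, 'time',
               stamp.str[-4:-2] + ':' + stamp.str[-2:])
    return out


def convert_zip(zip_path):
    """Convert every .csv member of one zip file and stack the rows."""
    frames = []
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith('.csv'):
                with zf.open(name) as member:
                    raw = pd.read_csv(member, dtype={'date': str})
                frames.append(split_date_time(raw))
    # pd.concat raises ValueError if the zip contained no .csv members
    return pd.concat(frames, ignore_index=True)


def merge_zips(folder, out_path):
    """Hypothetical driver: merge all zips in 'folder' into one CSV file."""
    parts = [convert_zip(p) for p in sorted(Path(folder).glob('*.zip'))]
    pd.concat(parts, ignore_index=True).to_csv(out_path, index=False)
```

Calling merge_zips('zips', 'merged.csv') would then write one combined file. Reading the members straight from the archive avoids unpacking the zips to disk first.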