Pandas dataframe output formatting
I'm importing a trade list and trying to consolidate it into a position file with summed quantities and average prices. I'm grouping based on (ticker, type, expiration and strike).
Two questions:
Dataframe:
GM stock 1 1 32 100
AAPL call 201612 120 3.5 1000
AAPL call 201612 120 3.25 1000
AAPL call 201611 120 2.5 2000
AAPL put 201612 115 2.5 500
AAPL stock 1 1 117 100
Code:
import pandas as pd
import numpy as np
df = pd.read_csv(input_file, index_col=['ticker', 'type', 'expiration', 'strike'], names=['ticker', 'type', 'expiration', 'strike', 'price', 'quantity'])
df_output = df.groupby(df.index).agg({'price':np.mean, 'quantity':np.sum})
df_output.to_csv(output_file, sep=',')
The CSV output comes out in this format:
(ticker, type, expiration, strike), price, quantity
Desired format:
ticker, type, expiration, strike, price, quantity
For the first question, you should use groupby(df.index_col) instead of groupby(df.index), where df.index_col is a plain list of the column names to group by (it is set by hand in the example below and is not a built-in pandas attribute).
For the second, I am not sure why you couldn't preserve "". Is that column numeric?
I mocked up some data as below:
import pandas as pd
import numpy as np
d = [
{'ticker':'A', 'type':'M', 'strike':'','price':32},
{'ticker':'B', 'type':'F', 'strike':100,'price':3.5},
{'ticker':'C', 'type':'F', 'strike':'', 'price':2.5}
]
df = pd.DataFrame(d)
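print(df)
#dgroup = df.groupby(['ticker', 'type']).agg({'price':np.mean})
# index_col is just a list of column names stored on df; newer pandas versions may warn about this assignment
df.index_col = ['ticker', 'type', 'strike']
# grouping by column names keeps each key as its own column in the CSV
dgroup = df.groupby(df.index_col).agg({'price':np.mean})
#dgroup = df.groupby(df.index).agg({'price':np.mean})
print(dgroup)
print(type(dgroup))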
dgroup.to_csv('check.csv')
Output in check.csv:
ticker,type,strike,price
A,M,,32.0
B,F,100,3.5
C,F,,2.5
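The same idea applied to the trade list from the question might look like the sketch below. It assumes the input file is comma-separated with no header row, and it reads the sample rows from an in-memory string instead of input_file. The names raw, key_cols, positions and csv_text are made up for illustration. Grouping by a list of column names, rather than by df.index, appears to be what keeps each key in its own CSV column; a MultiIndex passed as the grouping key seems to be treated as a single column of tuples.

```python
import io
import pandas as pd

# Sample rows copied from the question, assumed to be comma-separated with no header.
raw = """GM,stock,1,1,32,100
AAPL,call,201612,120,3.5,1000
AAPL,call,201612,120,3.25,1000
AAPL,call,201611,120,2.5,2000
AAPL,put,201612,115,2.5,500
AAPL,stock,1,1,117,100
"""

cols = ['ticker', 'type', 'expiration', 'strike', 'price', 'quantity']
key_cols = cols[:4]  # hypothetical name: the four columns that identify a position

# Read the keys as ordinary columns (no index_col), so they can be grouped by name.
df = pd.read_csv(io.StringIO(raw), names=cols)

# as_index=False keeps the grouping keys as regular columns in the result.
positions = (
    df.groupby(key_cols, as_index=False)
      .agg({'price': 'mean', 'quantity': 'sum'})
)

# index=False drops the row numbers, leaving one CSV column per field.
csv_text = positions.to_csv(index=False)
print(csv_text)
```

Note that 'mean' here is a simple average of the trade prices, as in the question's code, not a quantity-weighted average. To write a file instead of a string, pass a path to to_csv.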