Perform math calculations across two columns in a pandas DataFrame with one query?
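My question is related to my previous question, which was probably too long.
So I am breaking it down into shorter parts.
I want to perform some calculations across multiple columns of a pandas DataFrame.
My table: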
id1   date_time           adress      a_size
reom  2005-8-20 22:51:10  75157.5413  ceifwekd
reom  2005-8-20 1:01:25   3571.37946  ceifwekd
reom  2005-8-20 11:21:01  3571.37946  tnohcve
reom  2005-8-20 8:29:09   97439.219   tnohcve
penr  2005-8-20 17:07:16  97439.219   ceifwekd
penr  2005-8-20 9:10:37   7391.6258   ceifwekd
I need to find the ratio
total number of date_time / distinct number of a_size
for each id1.
I can do it like this:
df1 = df.groupby(['id1'])['date_time'].count().to_frame('nums').reset_index()
df2 = df.groupby(['id1'])['a_size'].nunique().to_frame('dist_num_a_size').reset_index()
new_df = pd.merge(df1, df2, on = 'id1', how = 'inner')
new_df['ratio'] = new_df['nums']/new_df['dist_num_a_size']
How can I do this in a single pandas query?
Thanks.
You can use groupby.apply with a lambda function you define yourself:
new_df = df.groupby('id1').apply(lambda x: x['date_time'].count() / x['a_size'].nunique())\
.reset_index()\
.rename({0:'ratio'},axis=1)
print(new_df)
id1 ratio
0 penr 2.0
1 reom 2.0
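For comparison, here is a self-contained sketch of the same per-group ratio written with named aggregation instead of apply. This is a swapped-in alternative, not the approach of the answer above. It rebuilds the question's sample table by hand (the adress column is left out because the ratio does not use it), reuses the question's intermediate column names nums and dist_num_a_size, and assumes a pandas version that supports named aggregation (0.25 or later).

```python
import pandas as pd

# Hand-built copy of the question's sample data (adress omitted: not needed here).
df = pd.DataFrame({
    "id1": ["reom", "reom", "reom", "reom", "penr", "penr"],
    "date_time": ["2005-8-20 22:51:10", "2005-8-20 1:01:25", "2005-8-20 11:21:01",
                  "2005-8-20 8:29:09", "2005-8-20 17:07:16", "2005-8-20 9:10:37"],
    "a_size": ["ceifwekd", "ceifwekd", "tnohcve", "tnohcve", "ceifwekd", "ceifwekd"],
})

new_df = (
    df.groupby("id1")
      # One pass over the groups: row count and number of distinct a_size values.
      .agg(nums=("date_time", "count"), dist_num_a_size=("a_size", "nunique"))
      # Derive the ratio from the two aggregated columns.
      .assign(ratio=lambda g: g["nums"] / g["dist_num_a_size"])
      .reset_index()
)
print(new_df)
```

This keeps the two intermediate counts in the result, which can be handy for checking the ratio by eye.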
df['ratio'] = df['id1'].map(df.groupby('id1')\
.apply(lambda x: x['date_time'].count() / x['a_size'].nunique()))
id1 date_time a_size ratio
0 reom 2005-8-20 ceifwekd 2.0
1 reom 2005-9-20 ceifwekd 2.0
2 reom 2005-10-20 tnohcve 2.0
3 reom 2005-11-20 tnohcve 2.0
4 penr 2005-12-20 ceifwekd 2.0
5 penr 2005-13-20 ceifwekd 2.0
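Alternatively, as in the second snippet above, you can use groupby with a lambda and then map the result back onto id1, so that each row of the original DataFrame carries its group's ratio.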
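To see why the mapping step works, it may help to look at the intermediate per-group Series. The sketch below is only an illustration on a hand-built copy of the question's data (without adress); the names per_group and out are made up for this example. It computes the ratio as a Series indexed by id1, then uses Series.map to look up each row's id1 in that Series.

```python
import pandas as pd

# Hand-built copy of the question's sample data (adress omitted).
df = pd.DataFrame({
    "id1": ["reom", "reom", "reom", "reom", "penr", "penr"],
    "date_time": ["2005-8-20 22:51:10", "2005-8-20 1:01:25", "2005-8-20 11:21:01",
                  "2005-8-20 8:29:09", "2005-8-20 17:07:16", "2005-8-20 9:10:37"],
    "a_size": ["ceifwekd", "ceifwekd", "tnohcve", "tnohcve", "ceifwekd", "ceifwekd"],
})

grouped = df.groupby("id1")

# One value per group, indexed by id1 (hypothetical name: per_group).
per_group = grouped["date_time"].count() / grouped["a_size"].nunique()
print(per_group)

# map() treats the Series as a lookup table: each id1 value is replaced
# by the ratio stored under that index label.
out = df.assign(ratio=df["id1"].map(per_group))
print(out)
```

Because the lookup is by label, the row order of the original DataFrame is preserved.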
You can use transform:
group = df.groupby(['id1'])
df['ratio'] = group['date_time'].transform('count') / group['a_size'].transform('nunique')
   id1   date_time           adress       a_size    ratio
0  reom  2005-8-20 22:51:10  75157.54130  ceifwekd  2.0
1  reom  2005-8-20 1:01:25   3571.37946   ceifwekd  2.0
2  reom  2005-8-20 11:21:01  3571.37946   tnohcve   2.0
3  reom  2005-8-20 8:29:09   97439.21900  tnohcve   2.0
4  penr  2005-8-20 17:07:16  97439.21900  ceifwekd  2.0
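As a sanity check, the sketch below runs the transform approach on a hand-built copy of the question's six rows and compares it with the merge-based version from the question. The names per_row, merged, and expected are introduced here purely for illustration.

```python
import pandas as pd

# Hand-built copy of the question's sample table.
df = pd.DataFrame({
    "id1": ["reom", "reom", "reom", "reom", "penr", "penr"],
    "date_time": ["2005-8-20 22:51:10", "2005-8-20 1:01:25", "2005-8-20 11:21:01",
                  "2005-8-20 8:29:09", "2005-8-20 17:07:16", "2005-8-20 9:10:37"],
    "adress": [75157.5413, 3571.37946, 3571.37946, 97439.219, 97439.219, 7391.6258],
    "a_size": ["ceifwekd", "ceifwekd", "tnohcve", "tnohcve", "ceifwekd", "ceifwekd"],
})

# transform broadcasts each group's aggregate back to every row of that group,
# so the result has the same index and length as df.
group = df.groupby("id1")
per_row = group["date_time"].transform("count") / group["a_size"].transform("nunique")

# Merge-based version from the question, for comparison.
df1 = df.groupby("id1")["date_time"].count().to_frame("nums").reset_index()
df2 = df.groupby("id1")["a_size"].nunique().to_frame("dist_num_a_size").reset_index()
merged = pd.merge(df1, df2, on="id1", how="inner")
merged["ratio"] = merged["nums"] / merged["dist_num_a_size"]

# Spread the per-group result over the rows and compare with transform.
expected = df["id1"].map(merged.set_index("id1")["ratio"])
print((per_row == expected).all())
```

Since the transform result is already aligned with the original index, it can be assigned directly as a new column without a merge or a map step.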