
pandas: problem using diff with groupby when index is non-unique

I'm using pandas (version 0.20.3) and I want to apply the diff() method together with groupby(), but instead of a DataFrame the result is an "underscore".

Here is the code:

import numpy as np
import pandas as pd

# creating the DataFrame
data = np.random.random(18).reshape(6,3)
indexes = ['B']*3 + ['A']*3
columns = ['x', 'y', 'z']
df = pd.DataFrame(data, index=indexes, columns=columns)
df.index.name = 'chain_id'

# Now I want to apply the diff method within each chain_id group
df.groupby('chain_id').diff()

The result is an underscore!

Note that df.loc['A'].diff() and df.loc['B'].diff() do return the expected results, so I don't understand why it doesn't work with groupby().

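IIUC, your error is: cannot reindex from a duplicate axis. Resetting the index gives groupby() a unique index to work with, and the original index can be put back afterwards: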

df.reset_index().groupby('chain_id').diff().set_index(df.index)
Out[859]: 
                 x         y         z
chain_id                              
B              NaN       NaN       NaN
B        -0.468771  0.192558 -0.443570
B         0.323697  0.288441  0.441060
A              NaN       NaN       NaN
A        -0.198785  0.056766  0.081513
A         0.138780  0.563841  0.635097
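
As a rough check of the workaround above, here is a minimal sketch that swaps the random data for deterministic values (an assumption made only so the result is reproducible) and compares the output with the per-chain df.loc[...].diff() calls mentioned in the question. Behaviour of groupby().diff() on a non-unique index may differ between pandas versions, so treat this as an illustration rather than a definitive recipe.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random data in the question (assumption)
data = np.arange(18, dtype=float).reshape(6, 3) ** 2
df = pd.DataFrame(data, index=['B'] * 3 + ['A'] * 3, columns=['x', 'y', 'z'])
df.index.name = 'chain_id'

# Workaround: group on a frame with a unique index, then restore the labels
result = df.reset_index().groupby('chain_id').diff().set_index(df.index)

# Reference: diff each chain separately, keeping the original row order
expected = pd.concat([df.loc['B'].diff(), df.loc['A'].diff()])

print(result)
print(result.equals(expected))
```

The first row of each chain is NaN because diff() has no previous row to subtract within that group.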

