How do you return just a groupby result in pandas?

Question

I have the following script, and I want a simple group by:

# import the pandas module
import pandas as pd
from openpyxl import load_workbook

writer = pd.ExcelWriter(r'D:\temp\test.xlsx', engine='openpyxl')
# Create an example dataframe
raw_data = {'Date': ['2016-05-13', '2016-05-13', '2016-05-13', '2016-05-13', '2016-05-13','2016-05-13', '2016-05-13', '2016-05-13', '2016-05-13', '2016-05-13', '2016-05-13', '2016-05-13', '2016-05-13', '2016-05-13', '2016-05-13', '2016-05-13', '2016-05-13'],
        'Portfolio': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B','B', 'B', 'B', 'C', 'C', 'C', 'C', 'C', 'C'],
        'Duration': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3],
        'Yield': [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1],}

df = pd.DataFrame(raw_data, columns = ['Date', 'Portfolio', 'Duration', 'Yield'])

dft = df.groupby(['Date', 'Portfolio', 'Duration', 'Yield'], as_index =False)

This creates a pandas groupby object.

Then I want to write it out to Excel:

dft.to_excel(writer, 'test', index=False)
writer.save()

But it raises an error:

AttributeError: Cannot access callable attribute 'to_excel' of 'DataFrameGroupBy' objects, try using the 'apply' method

Why do I need to use apply? I only want the group-by result so that the duplicate rows are removed.

Answer 1

You can in fact use groupby to remove duplicates, by taking the first row or the mean of each group, for example:

df.groupby(['Date', 'Portfolio', 'Duration', 'Yield'], as_index=False).mean()
df.groupby(['Date', 'Portfolio', 'Duration', 'Yield'], as_index=False).first()

Note that you have to apply a function (here the mean or first method) to get a DataFrame back from the groupby object. That DataFrame can then be written to Excel.
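To make that distinction concrete, here is a minimal sketch. It assumes a small made-up frame rather than the full data from the question, and it groups on two columns so that there is still a column left to aggregate:

```python
import pandas as pd

# Hypothetical sample with duplicated rows (not the asker's full data)
df = pd.DataFrame({
    'Portfolio': ['A', 'A', 'B', 'B', 'B'],
    'Duration': [1, 1, 2, 2, 2],
    'Yield': [0.3, 0.3, 2.0, 2.0, 2.0],
})

# groupby only describes how the rows are split into groups;
# the result is a DataFrameGroupBy, not a DataFrame
grouped = df.groupby(['Portfolio', 'Duration'], as_index=False)
print(type(grouped).__name__)

# Applying an aggregation such as first() collapses each group
# to one row and returns an ordinary DataFrame
result = grouped.first()
print(type(result).__name__)
print(result)
```

Only `result` has DataFrame methods such as to_excel, which is why calling to_excel directly on the groupby object fails.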

But as @EdChum pointed out, in this case the DataFrame's drop_duplicates method is the simpler approach:

df.drop_duplicates(subset=['Date', 'Portfolio', 'Duration', 'Yield'])
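Putting this together with the Excel export, a rough sketch could look like the following. The sample data, the helper name save_unique_rows and the output path are assumptions for illustration. The helper uses ExcelWriter as a context manager instead of calling writer.save(), since save() was removed in pandas 2.0, and it needs an Excel engine such as openpyxl to be installed:

```python
import pandas as pd

# Hypothetical sample with duplicated rows (not the asker's full data)
df = pd.DataFrame({
    'Date': ['2016-05-13'] * 5,
    'Portfolio': ['A', 'A', 'B', 'B', 'B'],
    'Duration': [1, 1, 2, 2, 2],
    'Yield': [0.3, 0.3, 2.0, 2.0, 2.0],
})

# Keep the first occurrence of each distinct combination of values
deduped = df.drop_duplicates(subset=['Date', 'Portfolio', 'Duration', 'Yield'])
print(deduped)


def save_unique_rows(frame, path):
    """Write the de-duplicated rows of frame to an Excel file at path.

    Hypothetical helper; requires an Excel engine such as openpyxl.
    """
    unique = frame.drop_duplicates()
    # The context manager saves and closes the workbook on exit
    with pd.ExcelWriter(path, engine='openpyxl') as writer:
        unique.to_excel(writer, sheet_name='test', index=False)
```

A call such as save_unique_rows(df, 'test.xlsx') would then write one row per portfolio to the sheet named test.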

Solution 1
2 Accepted 2016-05-17 12:39:04
