How do I create a list in a DataFrame column by looking up another DataFrame with pandas?

Question

I have a DataFrame named df with multiple columns (only 3 are shown below) and 90,000 rows:

        Key        Date     Rating
0      123abc   08/19/2015    A
1      456def   04/23/2013    B-
2      123abc   06/10/2012    C
3      789ghi   01/04/2017    B
.        .           .        .
.        .           .        .
90000  999zzz   12/12/2012    D

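I want to create a separate DataFrame, df_ratings, with two columns: Key and Rating List. In df_ratings, the Key column must be unique, and the Rating List column should contain a list of all the Rating values that appear for that Key in df.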

        Key       Rating List
0      123abc     ['A', 'C']
1      456def       ['B-']
2      789ghi     ['B', 'D']
.        .            .
.        .            .
30000  999zzz   ['A', 'C+', 'D']

The approach I have used so far is:

import pandas as pd

# Pair every Key with its Rating, one tuple per row of df
df_zip = list(zip(df['Key'], df['Rating']))

def dfRatingsList(row):
    rating = []
    for x, y in df_zip:
        if row['Key'] == x:
            rating.append(y)
    return rating

df_ratings = pd.DataFrame(df['Key'].unique(), columns=['Key'])
df_ratings = df_ratings.fillna('NULL')
df_ratings['Rating List'] = df_ratings.apply(dfRatingsList, axis=1)

Given the size of my dataset, this takes several hours to run. How can I speed this up or improve my code?

Answer 1

Try this:

df = df.groupby(by=['Key'], as_index=False).agg({'Rating': list})
print(df)

      Key        Rating
0  123abc  [A, A, A, A]
1  123def           [C]
2  456def          [B-]
3  789ghi           [B]
4  999zzz           [D]
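The approach in the question scans all 90,000 rows once per unique key, which is why it takes hours; a groupby collects the ratings for every key in a single pass. The sketch below adapts the one-liner above so that it builds a separate df_ratings with a Rating List column, as the question asks. The four-row sample frame is only an assumption made for illustration (it reuses the rows shown in the question), and sort=False is an optional choice that keeps keys in order of first appearance.

```python
import pandas as pd

# Small stand-in for the 90,000-row df from the question (illustrative data only)
df = pd.DataFrame({
    "Key": ["123abc", "456def", "123abc", "789ghi"],
    "Date": ["08/19/2015", "04/23/2013", "06/10/2012", "01/04/2017"],
    "Rating": ["A", "B-", "C", "B"],
})

# Group once by Key and collect each group's Rating values into a list.
# sort=False keeps the keys in the order they first appear in df.
df_ratings = (
    df.groupby("Key", sort=False)["Rating"]
      .agg(list)
      .reset_index(name="Rating List")
)

print(df_ratings)
```

Unlike the snippet above, this leaves df itself unchanged, so the other columns (such as Date) remain available afterwards.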

Answer 1: 0 · Accepted · 2020-05-27 14:52:40
