Group by and aggregate the values in pandas dataframe

I have the following dataframe in Python:

    meddra_id   meddra_label              soc       cross_ref                       soc_term
2   10000081    Abdominal pain            10017947  http://snomed.info/id/21522001  Gastrointestinal disorders
3   10017999    Gastrointestinal pain     10017947  http://snomed.info/id/21522001  Gastrointestinal disorders
15  10000340    Abstains from alcohol     10041244  http://snomed.info/id/105542008 Social circumstances
35  10001022    Acute psychosis           10037175  http://snomed.info/id/69322001  Psychiatric disorders
36  10061920    Psychotic disorder        10037175  http://snomed.info/id/69322001  Psychiatric disorders

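I want to aggregate the values in the columns "meddra_id", "meddra_label", "soc" and "soc_term", grouped by another column, "cross_ref", and exclude the rows where a "cross_ref" is associated with only a single "meddra_id".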

The expected output is:

meddra_id           meddra_label                           soc      cross_ref                       soc_term
10000081,10017999   Abdominal pain,Gastrointestinal pain   10017947 http://snomed.info/id/21522001  Gastrointestinal disorders
10001022,10061920   Acute psychosis,Psychotic disorder     10037175 http://snomed.info/id/69322001  Psychiatric disorders

I am trying the following lines of code:

df_terms = df.groupby('cross_ref').filter(lambda g: len(g) > 1).drop_duplicates(subset=['meddra_id', 'meddra_label', 'soc', 'soc_term'], keep="first")

#aggregate the values
df_terms = df_terms.groupby('cross_ref')['meddra_id', 'meddra_label', 'soc', 'soc_term'].agg(' , '.join).reset_index()

When I aggregate the values, the "soc_term" column does not show up in the new dataframe (df_terms).

Any help is highly appreciated.

Use agg to join the values in the different columns:

# Group by cross_ref, as in the question
df_grouped = df.groupby('cross_ref')
# Keep only the groups that have more than one distinct meddra_id
df_filtered = df_grouped.filter(lambda g: g['meddra_id'].nunique() > 1)

# Join the values of each column with a comma; map(str, ...) converts
# non-string values (ints, floats, NaN) so that str.join does not fail
df_aggregated = df_filtered.groupby('cross_ref').agg({
    'meddra_id': lambda x: ', '.join(map(str, x)),
    'meddra_label': ', '.join,
    'soc': lambda x: ', '.join(map(str, x)),
    'soc_term': lambda x: ', '.join(map(str, x)),
}).reset_index()
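
For anyone who wants to try this end to end, here is a minimal, self-contained sketch that rebuilds the five sample rows from the question and applies the same filter-then-aggregate idea. It is not the only way to do it. The helper `join_unique` is a hypothetical name introduced here (it is not a pandas function), and it assumes that values repeated within a group, such as "soc" and "soc_term", should appear only once, as the expected output in the question shows. The dtypes are also an assumption: "meddra_id" and "soc" are taken to be integers.

```python
import pandas as pd

# Sample rows copied from the question (integer dtypes are an assumption).
df = pd.DataFrame(
    {
        "meddra_id": [10000081, 10017999, 10000340, 10001022, 10061920],
        "meddra_label": [
            "Abdominal pain",
            "Gastrointestinal pain",
            "Abstains from alcohol",
            "Acute psychosis",
            "Psychotic disorder",
        ],
        "soc": [10017947, 10017947, 10041244, 10037175, 10037175],
        "cross_ref": [
            "http://snomed.info/id/21522001",
            "http://snomed.info/id/21522001",
            "http://snomed.info/id/105542008",
            "http://snomed.info/id/69322001",
            "http://snomed.info/id/69322001",
        ],
        "soc_term": [
            "Gastrointestinal disorders",
            "Gastrointestinal disorders",
            "Social circumstances",
            "Psychiatric disorders",
            "Psychiatric disorders",
        ],
    },
    index=[2, 3, 15, 35, 36],
)


def join_unique(values):
    """Hypothetical helper: join the distinct values of one group as strings,
    keeping the order in which they first appear."""
    return ",".join(dict.fromkeys(map(str, values)))


# Keep only the cross_ref groups that map to more than one distinct meddra_id.
multi = df.groupby("cross_ref").filter(lambda g: g["meddra_id"].nunique() > 1)

# sort=False keeps the groups in order of first appearance;
# the final indexing restores the column order of the original dataframe.
result = (
    multi.groupby("cross_ref", sort=False)
    .agg(join_unique)
    .reset_index()[df.columns.tolist()]
)

print(result.to_string(index=False))
```

Because every value is passed through `str` before joining, integer columns and missing values (NaN) no longer make `str.join` raise. Note that the aggregated "meddra_id" and "soc" columns hold strings afterwards, not numbers.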
