
Pandas assign with apply lambda multiple columns with condition

I am looking for the right way to copy label into columns matching labItemsNameRef in my dataframe, but I can't get my code to work. Is there a solution?

MY DATAFRAME

        labItemsNameRef     label
0       FBS                 decrease
1       FBS                 decrease
2       FBS                 increase
3       HbA1c               decrease
4       Creatinine          changeless
...    ...                  ...
123901  FBS                 decrease
123902  HbA1c               increase
123903  Micro Creatinine    changeless
123904  DTX ก่อนอาหาร       increase
123905  Urine Creatinine    changeless
df = df.assign(
     FBS = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'FBS'),
     HbA1c = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'HbA1c'),
     DTX = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'DTX'),
     BUN = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'BUN'),
     Creatinine = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'Creatinine'))

But I got this error:

    FBX = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'FBX'),
                                                                                   ^
SyntaxError: invalid syntax
EXPECTED OUTPUT

       labItemsNameRef  label       FBS      HbA1c    Creatinine BUN DTX
0      FBS              decrease    decrease NaN      NaN        NaN NaN
1      FBS              decrease    decrease NaN      NaN        NaN NaN
2      FBS              increase    increase NaN      NaN        NaN NaN
3      HbA1c            decrease    NaN      decrease NaN        NaN NaN
4      Creatinine       changeless  NaN      NaN      changeless NaN NaN
...    ...              ...         ...      ...      ...        ... ...
123901 FBS              decrease    decrease NaN      NaN        NaN NaN
123902 HbA1c            increase    NaN      increase NaN        NaN NaN
123903 Micro Creatinine changeless  NaN      NaN      NaN        NaN NaN
123904 DTX ก่อนอาหาร     increase    NaN      NaN      NaN        NaN NaN
123905 Urine Creatinine changeless  NaN      NaN      NaN        NaN NaN

Use get_dummies to build indicator columns, then fill them with the values from label using numpy.where:

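import numpy as np
import pandas as pd

m = pd.get_dummies(df['labItemsNameRef'], dtype=bool)  # one boolean column per distinct name
df[m.columns] = np.where(m, df[['label']], np.nan)     # label where True, NaN elsewhere
print(df)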
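
Note that get_dummies creates one column for every distinct value of labItemsNameRef, not only the five in the expected output. A minimal sketch of one way to restrict the result to those five, assuming a small illustrative sample frame; the `wanted` list and the use of `reindex` and `join` are additions here, not part of the answer above:

```python
import numpy as np
import pandas as pd

# Small sample in the shape of the question's data (values are illustrative).
df = pd.DataFrame({
    "labItemsNameRef": ["FBS", "HbA1c", "Creatinine", "Micro Creatinine"],
    "label": ["decrease", "increase", "changeless", "changeless"],
})

# Hypothetical list of the columns named in the expected output.
wanted = ["FBS", "HbA1c", "Creatinine", "BUN", "DTX"]

# Indicator columns for every distinct name, reduced to the wanted ones.
# Names that never occur (BUN and DTX in this sample) become all-False columns.
m = pd.get_dummies(df["labItemsNameRef"], dtype=bool).reindex(
    columns=wanted, fill_value=False
)

# Copy label where the indicator is True, NaN elsewhere, then attach the result.
new_cols = pd.DataFrame(
    np.where(m, df[["label"]], np.nan), index=df.index, columns=wanted
)
df = df.join(new_cols)
print(df)
```

Rows such as "Micro Creatinine" match none of the wanted names, so they end up with NaN in every new column, as in the expected output.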

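Your solution is slow because apply loops over the rows, but it can work if you add an else branch and axis=1. The SyntaxError occurs because a Python conditional expression (x if cond) must have an else part: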

df = df.assign(FBS = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'FBS' else np.nan, axis=1))
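
If the row-wise apply is too slow, the same columns could also be built without it. The following is a sketch of an alternative, not the method from the answer above: it assumes Series.where (which keeps values where the condition holds and puts NaN elsewhere) and a hypothetical `names` list, on an illustrative sample frame:

```python
import pandas as pd

# Small sample in the shape of the question's data (values are illustrative).
df = pd.DataFrame({
    "labItemsNameRef": ["FBS", "FBS", "HbA1c", "Urine Creatinine"],
    "label": ["decrease", "increase", "decrease", "changeless"],
})

# Hypothetical list of the column names used in the question's assign call.
names = ["FBS", "HbA1c", "DTX", "BUN", "Creatinine"]

# For each name, keep label where labItemsNameRef equals that name, NaN otherwise.
# The comparison is vectorized, so no per-row lambda is needed.
df = df.assign(
    **{name: df["label"].where(df["labItemsNameRef"] == name) for name in names}
)
print(df)
```

The comparison is an exact match, so "Urine Creatinine" does not populate the Creatinine column.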

