Pandas assign with apply lambda multiple columns with condition

I am looking for the right way to copy label into the column that matches labItemsNameRef in my dataframe, but I can't get the code to work. Is there a solution?

MY DATAFRAME

        labItemsNameRef     label
0       FBS                 decrease
1       FBS                 decrease
2       FBS                 increase
3       HbA1c               decrease
4       Creatinine          changeless
...    ...                  ...
123901  FBS                 decrease
123902  HbA1c               increase
123903  Micro Creatinine    changeless
123904  DTX ก่อนอาหาร       increase
123905  Urine Creatinine    changeless
df = df.assign(
     FBS = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'FBS'),
     HbA1c = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'HbA1c'),
     DTX = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'DTX'),
     BUN = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'BUN'),
     Creatinine = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'Creatinine'))

But I get this error:

    FBX = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'FBX'),
                                                                                   ^
SyntaxError: invalid syntax
EXPECTED OUTPUT

        labItemsNameRef   label       FBS       HbA1c     Creatinine  BUN  DTX
0       FBS               decrease    decrease  NaN       NaN         NaN  NaN
1       FBS               decrease    decrease  NaN       NaN         NaN  NaN
2       FBS               increase    increase  NaN       NaN         NaN  NaN
3       HbA1c             decrease    NaN       decrease  NaN         NaN  NaN
4       Creatinine        changeless  NaN       NaN       changeless  NaN  NaN
...     ...               ...         ...       ...       ...         ...  ...
123901  FBS               decrease    decrease  NaN       NaN         NaN  NaN
123902  HbA1c             increase    NaN       increase  NaN         NaN  NaN
123903  Micro Creatinine  changeless  NaN       NaN       NaN         NaN  NaN
123904  DTX ก่อนอาหาร      increase    NaN       NaN       NaN         NaN  NaN
123905  Urine Creatinine  changeless  NaN       NaN       NaN         NaN  NaN

Use get_dummies to build boolean indicator columns, then use numpy.where to fill them with the values from label:

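import numpy as np
import pandas as pd
# m has one boolean column per distinct labItemsNameRef value;
# np.where copies label where m is True and puts NaN elsewhere
m = pd.get_dummies(df['labItemsNameRef'], dtype=bool)
df[m.columns] = np.where(m, df[['label']], np.nan)
print(df)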
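
The snippet above creates one column for every distinct value in labItemsNameRef, so names such as Micro Creatinine would get their own columns too. As a minimal sketch, assuming only the five columns from the expected output are wanted, the indicator frame can be restricted with reindex before calling numpy.where. The sample data and the name `targets` are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Small made-up sample in the shape of the question's dataframe
df = pd.DataFrame({
    "labItemsNameRef": ["FBS", "FBS", "HbA1c", "Creatinine", "Micro Creatinine"],
    "label": ["decrease", "increase", "decrease", "changeless", "changeless"],
})

# Hypothetical list of wanted columns, taken from the expected output
targets = ["FBS", "HbA1c", "Creatinine", "BUN", "DTX"]

# Boolean indicators, limited to the wanted columns; names that never
# occur in the data (here BUN and DTX) become all-False columns
m = pd.get_dummies(df["labItemsNameRef"], dtype=bool).reindex(
    columns=targets, fill_value=False
)

# df[["label"]] has shape (n, 1), so it broadcasts across all indicator columns
df[targets] = np.where(m, df[["label"]], np.nan)
print(df)
```

Rows whose labItemsNameRef is not in `targets` end up with NaN in every new column, which matches the Micro Creatinine and Urine Creatinine rows in the expected output.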

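Your solution is slow because apply loops over the rows in Python, but it does work once you add an else branch and axis=1. The SyntaxError occurs because a conditional expression (x if condition) must have an else part, and axis=1 is needed so that the lambda receives rows rather than columns: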

df = df.assign(FBS = lambda df: df.apply(lambda x: x['label'] if x['labItemsNameRef'] == 'FBS' else np.nan, axis=1))
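
The line above fixes only the FBS column; the other four need the same change. As an alternative sketch that avoids the row-wise apply, Series.where keeps label where the comparison is True and yields NaN elsewhere, so all five columns can be built in one assign call. This is a swapped-in technique rather than the answer's own code, and the sample data and the names `targets` and `out` are made up for illustration.

```python
import pandas as pd

# Small made-up sample in the shape of the question's dataframe
df = pd.DataFrame({
    "labItemsNameRef": ["FBS", "HbA1c", "Creatinine", "DTX ก่อนอาหาร"],
    "label": ["decrease", "increase", "changeless", "increase"],
})

# Hypothetical list of column names, taken from the question's assign call
targets = ["FBS", "HbA1c", "DTX", "BUN", "Creatinine"]

# For each name, keep label where labItemsNameRef equals it exactly;
# Series.where replaces every other row with NaN
out = df.assign(**{
    name: df["label"].where(df["labItemsNameRef"] == name)
    for name in targets
})
print(out)
```

Because the comparison is an exact match, the "DTX ก่อนอาหาร" row does not fill the DTX column, which is consistent with the expected output in the question.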

