
How to slice a pandas data frame based on a conditional if statement using df.iterrows and create new dataframe

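I have a large data frame generated from multiple Excel files. I want to slice the data frame row by row and generate separate data frames based on the value in the "Sample Name" column. The data frame I want to slice looks like this: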

    Well Position Sample Name  Target Name  CT
0              A1       human      52928.0  40
1              A2       mouse      52928.0  32
2              A3         rat      52928.0  40
3              A4       human      52928.0  40
4              A5       human      52928.0  35

The source Excel files may or may not contain data for all three species. For example, they could be all human samples, all mouse samples, or all rat samples.

The result I want is:

human_df 
    Well Position Sample Name  Target Name  CT
0              A1       human      52928.0  40
1              A4       human      52928.0  40
2              A5       human      52928.0  35

rat_df
    Well Position Sample Name  Target Name  CT
0              A3         rat      52928.0  40

mouse_df
    Well Position Sample Name  Target Name  CT
0              A2       mouse      52928.0  32

The function I tried for this is:

for i,row in data.iterrows():
            if row['Sample Name'] in data.iterrows() == 'mouse' or 'Mouse' or 'MOUSE':
                species = 'mouse'
                #make a new df_mouse
                df_mouse = data[(data['Sample Name'] == species)] 

            if row['Sample Name'] in data.iterrows() == 'human' or 'Human' or 'HUMAN':
                species = 'human'
                df_human = data[(data['Sample Name'] == species)]
                print("Human Dataframe = ", df_human)

            if row['Sample Name'] in data.iterrows() == 'rat' or 'Rat' or 'RAT':
                species = 'rat'
                df_rat = data[(data['Sample Name'] == species)]
                print("Rat Dataframe = ", df_rat)

This works to an extent, but it fails when one of the three species is not in the original Excel file. Thanks in advance for your help.

Use DataFrame.groupby on the column Sample Name and use a dict comprehension to store each grouped frame in the dictionary dct. To reference a stored dataframe, just use dct['name_of_df'].

dct = {f'{k}_df': g.reset_index(drop=True) for k, g in df.groupby('Sample Name')}

Result:

# dct['human_df']
  Well Position Sample Name  Target Name  CT
0            A1       human      52928.0  40
1            A4       human      52928.0  40
2            A5       human      52928.0  35

# dct['rat_df']
    Well Position Sample Name  Target Name  CT
0              A3         rat      52928.0  40

# dct['mouse_df']
    Well Position Sample Name  Target Name  CT
0              A2       mouse      52928.0  32
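
The question also mentions mixed-case names ('Mouse', 'MOUSE') and files where a species is missing. A minimal sketch of one way to handle both with the same groupby idea follows. The sample values, the lower-cased grouping key, and the `empty` fallback frame are assumptions added for illustration and are not part of the original answer:

```python
import pandas as pd

# Sample data modelled on the question; the mixed-case names are an
# assumption, and this sample deliberately contains no rat rows.
data = pd.DataFrame({
    "Well Position": ["A1", "A2", "A3", "A4", "A5"],
    "Sample Name": ["human", "Mouse", "HUMAN", "human", "mouse"],
    "Target Name": [52928.0] * 5,
    "CT": [40, 32, 40, 40, 35],
})

# Group on a stripped, lower-cased key so that 'Mouse', 'MOUSE' and
# 'mouse' end up in the same group. The original column is left unchanged.
key = data["Sample Name"].str.strip().str.lower()
dct = {f"{k}_df": g.reset_index(drop=True) for k, g in data.groupby(key)}

# Empty frame with the same columns, returned when a species is absent.
empty = data.iloc[0:0]

human_df = dct.get("human_df", empty)
mouse_df = dct.get("mouse_df", empty)
rat_df = dct.get("rat_df", empty)  # falls back to the empty frame here

print(len(human_df), len(mouse_df), len(rat_df))
```

Using dct.get with a default avoids the KeyError that dct['rat_df'] would raise when the file has no rat samples. As a side note, a condition such as `x == 'mouse' or 'Mouse' or 'MOUSE'` is always truthy in Python because a non-empty string is truthy, so each branch in the original loop runs for every row regardless of the sample name.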
