
Updating row in pandas dataframe using loc not working properly

I have a dataframe named output:

RAW_ENTITY_NAME   ENTITY_TYPE       ENTITY_NAME        IS_MAIN
01-03-2017        TNRMATDT          01 03 2017         1
04-02-2017        TNRSTRTDT         04 02 2017         1
documents         TNRTYPE           SIGHT              1
documents         TNRDOCSBY         NOT FOUND          1
accept            TNRDTL            accept             1 
23                TNRDAYS           23                 1

print(output.dtypes)

RAW_ENTITY_NAME               object
ENTITY_TYPE                   object
ENTITY_NAME                   object
IS_MAIN                       object

Note that the combination ENTITY_TYPE = TNRTYPE, ENTITY_NAME = SIGHT and IS_MAIN = 1 occurs only once in the dataframe.

If ENTITY_TYPE is TNRTYPE, ENTITY_NAME = SIGHT and IS_MAIN = 1, I want to update some values.

temp = output.loc[(output['IS_MAIN'] == 1) & (output['ENTITY_TYPE'] == 'TNRTYPE'), 'ENTITY_NAME']
temp = temp.reset_index(drop=True)
temp = temp[0]
if (temp == 'SIGHT'):
   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'] == 'TNRDOCSBY'), 'ENTITY_NAME'] = 'PAYMENT'

   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'].isin(['TNRDTL'])),
                                   ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = 'NOT APPLICABLE'

   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'].isin(['TNRDAYS'])),
                                   ['ENTITY_NAME']] = '0'

   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE'].isin(['TNRDAYS'])),
                                   ['RAW_ENTITY_NAME']] = ''

   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRSTRTDT'),
                                   ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''

   output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRMATDT'),
                                   ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''

The final output is:

RAW_ENTITY_NAME   ENTITY_TYPE       ENTITY_NAME        IS_MAIN
01-03-2017        TNRMATDT          01 03 2017         1
04-02-2017        TNRSTRTDT         04 02 2017         1
documents         TNRTYPE           SIGHT              1
documents         TNRDOCSBY         PAYMENT            1
NOT APPLICABLE    TNRDTL            NOT APPLICABLE     1
                  TNRDAYS           0                  1

As you can see, everything is updated except the first two rows, i.e. ENTITY_TYPE = TNRMATDT and TNRSTRTDT.

I want to know why the code below does not give the desired result.

output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRSTRTDT'),
                                   ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''

output.loc[(output['IS_MAIN'] == '1') & (output['ENTITY_TYPE']=='TNRMATDT'),
                                   ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''

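I would be glad if someone could spot my mistake or suggest a workaround.

Thanks a lot.

For me your solution works fine. I tried to rewrite it for better readability and to avoid repeating the same conditions: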

temp = output.loc[(output['IS_MAIN'] == '1') & 
                  (output['ENTITY_TYPE'] == 'TNRTYPE'), 'ENTITY_NAME']

#if values in IS_MAIN are integers
#temp = output.loc[(output['IS_MAIN'] == 1) & 
#                  (output['ENTITY_TYPE'] == 'TNRTYPE'), 'ENTITY_NAME']

if (temp.iat[0] == 'SIGHT'):
# more general alternative that also works when no row matches the condition
#if (next(iter(temp), 'not match') == 'SIGHT'):

    m1 = output['IS_MAIN'] == '1'
    #if values in IS_MAIN are integers
    #m1 = output['IS_MAIN'] == 1
    m2 = output['ENTITY_TYPE'] == 'TNRDOCSBY'
    m3 = output['ENTITY_TYPE'] == 'TNRDTL'
    m4 = output['ENTITY_TYPE'] == 'TNRDAYS'
    m5 = output['ENTITY_TYPE'].isin(['TNRMATDT','TNRSTRTDT'])

    output.loc[m1 & m2, 'ENTITY_NAME'] = 'PAYMENT'

    output.loc[m1 & m3, ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = 'NOT APPLICABLE'

    output.loc[m1 & m4, ['ENTITY_NAME']] = '0'
    output.loc[m1 & m4, ['RAW_ENTITY_NAME']] = ''

    output.loc[m1 & m5, ['ENTITY_NAME', 'RAW_ENTITY_NAME']] = ''

print (output)
  RAW_ENTITY_NAME ENTITY_TYPE     ENTITY_NAME IS_MAIN
0                    TNRMATDT                       1
1                   TNRSTRTDT                       1
2       documents     TNRTYPE           SIGHT       1
3       documents   TNRDOCSBY         PAYMENT       1
4  NOT APPLICABLE      TNRDTL  NOT APPLICABLE       1
5                     TNRDAYS               0       1
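
Since the code works on a clean sample, one possible explanation (an assumption, not something the question confirms) is that the object-typed IS_MAIN column holds a mix of the integer 1 and the string '1', so a comparison against '1' silently skips some rows. A minimal sketch of how that could be checked, using a small hypothetical sample rather than the asker's real data:

```python
import pandas as pd

# Hypothetical sample: IS_MAIN mixes int 1 and str '1' (an assumption)
output = pd.DataFrame({
    'ENTITY_TYPE': ['TNRMATDT', 'TNRSTRTDT', 'TNRTYPE', 'TNRDOCSBY'],
    'ENTITY_NAME': ['01 03 2017', '04 02 2017', 'SIGHT', 'NOT FOUND'],
    'IS_MAIN': [1, 1, '1', '1'],
})

# Count the Python types actually stored in the object column
type_counts = output['IS_MAIN'].map(type).value_counts()
print(type_counts)

# Comparing the string form matches both 1 and '1'
m1 = output['IS_MAIN'].astype(str) == '1'
m5 = output['ENTITY_TYPE'].isin(['TNRMATDT', 'TNRSTRTDT'])
output.loc[m1 & m5, 'ENTITY_NAME'] = ''
print(output)
```

If more than one type shows up in the count, building the mask from the string form as above leaves the column itself unchanged while still matching every row.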

I had the same problem. All you have to do is convert the IS_MAIN column to a numeric type:

df['IS_MAIN'] = df['IS_MAIN'].astype(int)

That should make it work.
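
A minimal sketch of that conversion, assuming the column really contains a mix of 1 and '1' (the sample frame below is hypothetical). Note that once the column is an integer type, the masks have to compare against the integer 1 rather than the string '1':

```python
import pandas as pd

# Hypothetical sample with mixed int/str values in IS_MAIN (an assumption)
df = pd.DataFrame({
    'ENTITY_TYPE': ['TNRMATDT', 'TNRSTRTDT', 'TNRDAYS'],
    'ENTITY_NAME': ['01 03 2017', '04 02 2017', '23'],
    'IS_MAIN': [1, '1', '1'],
})

# Before the conversion, a string comparison misses the row holding int 1
str_mask = df['IS_MAIN'] == '1'
print(str_mask.tolist())

# Normalize the whole column to integers
df['IS_MAIN'] = df['IS_MAIN'].astype(int)

# After the conversion, compare against the integer 1
int_mask = df['IS_MAIN'] == 1
date_mask = df['ENTITY_TYPE'].isin(['TNRMATDT', 'TNRSTRTDT'])
df.loc[int_mask & date_mask, 'ENTITY_NAME'] = ''
print(df)
```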
