
Shift specific strings pandas df

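I am trying to shift specific strings up a row in a pandas df. These strings are located in the same or adjacent columns.

The df below is an example. The specified strings are Cat and Dog. I want to move these values up one row. The values are in Column C and Column D.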

import pandas as pd 

d = ({
    'A' : ['A','Yy','A','Xy','A','Zy','Yy'],
    'B' : ['Big','X','Big','X','Very','X','X'],           
    'C' : ['','Cat','YY','Dog','Big','XY','YY'],
    'D' : ['','','Xy','Yy','','Cat','Yy'],
    'E' : ['','','Xy','XX','','','Xy'],           
    })

df = pd.DataFrame(data=d)

My expected output is:

    A     B    C    D   E
0   A   Big  Cat         
1  Yy     X              
2   A   Big  Dog   Xy  Xy
3  Xy     X        Yy  XX
4   A  Very  Big  Cat    
5  Zy     X   XY         
6  Yy     X   YY   Yy  Xy

I have tried:

df['C'] = df['C'].shift(-1)

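But this shifts all of the values up. I only want to select specific values (e.g. Cat and Dog) in certain columns and move them up one row.

I was thinking of listing the specified values and then shifting those up. Something like: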

val = ['Cat','Dog']

# pseudocode: if a value in df[['C', 'D']] is in val, shift it up one row

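Note: I can't sort this based on the surrounding strings. My actual df contains a wide variety of strings, and going through them would take a long time.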

In this case, do the following:

# Swap rows 0 and 1 in column C, and rows 4 and 5 in column D
df.loc[[0, 1], 'C'] = df.loc[[1, 0], 'C'].to_numpy()
df.loc[[4, 5], 'D'] = df.loc[[5, 4], 'D'].to_numpy()
print(df)

Output:

     A    B       C      D  E
0    A  Big     Cat          
1    X    X                  
2    X    X       X      X  X
3    X    X       X      X  X
4  Foo  Bar  Foobar  Fubur   
5    X    X       X          
6    X    X       X      X  X

For a general solution, combine np.where() with the pandas eq() method:

import numpy as np

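def shift_value(df, value):
    # Row and column positions of every cell equal to value
    row, col = np.where(df.eq(value))
    if len(row) == 0:
        return  # value not found, nothing to shift
    # Only the first match is moved
    old_row = row[0]
    old_col = col[0]
    new_row = old_row - 1  # a match in row 0 would wrap around to the last row
    new_col = old_col
    df.iat[new_row, new_col] = value
    df.iat[old_row, old_col] = "X"  # placeholder left in the old cell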

for v in ["Cat", "Foobar"]:
    shift_value(df, v)

df
     A    B       C      D  E
0    A  Big     Cat          
1    X    X       X          
2    X    X       X      X  X
3    X    X  Foobar      X  X
4  Foo  Bar       X          
5    X    X       X  Fubur   
6    X    X       X      X  X

Original OP data:

d = ({
    'A' : ['A','X','X','X','Foo','X','X'],
    'B' : ['Big','X','X','X','Bar','X','X'],           
    'C' : ['','Cat','X','X','Foobar','X','X'],
    'D' : ['','','X','X','','Fubur','X'],
    'E' : ['','','X','X','','','X'],           
    })

df = pd.DataFrame(data=d)
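
The `shift_value` helper above moves only the first match of a single value. As a rough sketch, the same np.where() idea can be extended to every occurrence of several values by swapping `eq()` for `isin()`. The function name `shift_values_up` and the `filler` parameter are my own, not part of the answer, and the data is the Cat/Dog frame from the question:

```python
import numpy as np
import pandas as pd

def shift_values_up(df, values, filler=""):
    """Return a copy of df with every cell found in `values` moved up one row."""
    out = df.copy()
    # np.where returns positions in row-major order, so rows are ascending
    rows, cols = np.where(df.isin(values).to_numpy())
    for r, c in zip(rows, cols):
        if r == 0:
            continue  # nothing above the first row, leave the cell alone
        out.iat[r - 1, c] = df.iat[r, c]  # overwrite the cell above
        out.iat[r, c] = filler            # blank the old position
    return out

df = pd.DataFrame({
    'A': ['A', 'Yy', 'A', 'Xy', 'A', 'Zy', 'Yy'],
    'B': ['Big', 'X', 'Big', 'X', 'Very', 'X', 'X'],
    'C': ['', 'Cat', 'YY', 'Dog', 'Big', 'XY', 'YY'],
    'D': ['', '', 'Xy', 'Yy', '', 'Cat', 'Yy'],
    'E': ['', '', 'Xy', 'XX', '', '', 'Xy'],
})

result = shift_values_up(df, ['Cat', 'Dog'])
print(result)
```

Note that a moved value overwrites whatever was in the cell above it, which is what the expected output in the question does with `YY` in column C.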

If what you need is to shift up the value in every row that contains only a single meaningful word, then this should be an answer:

In [36]: import pandas as pd
    ...: d = ({
    ...:     'A' : ['A','X','X','X','Foo','X','X'],
    ...:     'B' : ['Big','X','X','X','Bar','X','X'],
    ...:     'C' : ['','Cat','X','X','Foobar','X','X'],
    ...:     'D' : ['','','X','X','','Fubur','X'],
    ...:     'E' : ['','','X','X','','','X'],
    ...:     })
    ...: df = pd.DataFrame(data=d)
    ...:
    ...: index = ((df!='X') & (df!='') & df.notna()).sum(axis=1) == 1
    ...: for row in df[index].index.values:
    ...:     for col in df.columns.values:
    ...:         if df.loc[row, col]!='X' and bool(df.loc[row, col]):
    ...:             df.loc[row-1, col] = df.loc[row, col]
    ...:             df.loc[row, col] = ''
    ...:

In [37]: df
Out[37]:
     A    B       C      D  E
0    A  Big     Cat
1    X    X
2    X    X       X      X  X
3    X    X       X      X  X
4  Foo  Bar  Foobar  Fubur
5    X    X       X
6    X    X       X      X  X
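
If the nested loop gets slow on a larger frame, the same rule (rows with exactly one cell that is neither 'X' nor empty) can probably be expressed without loops. This is only a sketch under that assumption. The names `meaningful`, `single`, `move`, `moved` and `out` are mine, and a qualifying value in row 0 would be dropped by `shift(-1)` rather than moved:

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['A', 'X', 'X', 'X', 'Foo', 'X', 'X'],
    'B': ['Big', 'X', 'X', 'X', 'Bar', 'X', 'X'],
    'C': ['', 'Cat', 'X', 'X', 'Foobar', 'X', 'X'],
    'D': ['', '', 'X', 'X', '', 'Fubur', 'X'],
    'E': ['', '', 'X', 'X', '', '', 'X'],
})

# True where a cell holds something other than 'X' or an empty string
meaningful = df.ne('X') & df.ne('') & df.notna()
# True for rows that hold exactly one such cell
single = meaningful.sum(axis=1).eq(1)
# Cells to move: meaningful cells that sit in a "single" row
move = pd.DataFrame(
    meaningful.to_numpy() & single.to_numpy()[:, None],
    index=df.index, columns=df.columns,
)

moved = df.where(move).shift(-1)              # selected cells one row up, NaN elsewhere
out = moved.combine_first(df.mask(move, ''))  # fill the rest from the blanked frame
print(out)
```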

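So, if the data is not too large, you can try a for loop that moves every non-empty value into an empty cell directly above it: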

for row in range(1, len(df)):
    for col in df.columns.values:
        if (df.loc[row, col] != '') and (df.loc[row-1, col] == ''):
            df.loc[row-1, col] = df.loc[row, col]
            df.loc[row, col] = '######'
df = df.replace('######', '')

I think you need df.combine_first:

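mylist = ['Cat', 'Dog']
# Keep only the listed values (NaN elsewhere) and move them up one row
a = df[df.isin(mylist)].shift(-1)
# Blank the listed values at their original positions
df[df.isin(mylist)] = ""
# Take the shifted values where present, otherwise the remaining df values
out_df = a.combine_first(df)
print(out_df)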
    A     B    C    D   E
0   A   Big  Cat         
1  Yy     X              
2   A   Big  Dog   Xy  Xy
3  Xy     X        Yy  XX
4   A  Very  Big  Cat    
5  Zy     X   XY         
6  Yy     X   YY   Yy  Xy
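
The snippet above modifies df in place and looks at every column. If you would rather leave df untouched and restrict the shift to Column C and Column D as in the question, something along these lines might work. It is a sketch rather than a tested drop-in, and the names `targets`, `cols`, `dest` and `block` are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['A', 'Yy', 'A', 'Xy', 'A', 'Zy', 'Yy'],
    'B': ['Big', 'X', 'Big', 'X', 'Very', 'X', 'X'],
    'C': ['', 'Cat', 'YY', 'Dog', 'Big', 'XY', 'YY'],
    'D': ['', '', 'Xy', 'Yy', '', 'Cat', 'Yy'],
    'E': ['', '', 'Xy', 'XX', '', '', 'Xy'],
})

targets = ['Cat', 'Dog']
cols = ['C', 'D']

mask = df[cols].isin(targets)            # True where a target string sits now
dest = mask.shift(-1, fill_value=False)  # True where a target should land

block = df[cols].mask(mask, '')               # blank the original positions
block = block.mask(dest, df[cols].shift(-1))  # write each target one row up

out = df.copy()
out[cols] = block
print(out)
```

Because the destination cells are tracked with a boolean mask instead of NaN, this variant does not depend on combine_first.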

