
Shift specific strings pandas df

I'm trying to shift specific strings in a pandas df up one row. These strings are located in the same or adjacent columns.

The df below is an example. The specified strings are Cat and Dog. I want to shift these values up one row. They appear in Column C and Column D.

import pandas as pd 

d = ({
    'A' : ['A','Yy','A','Xy','A','Zy','Yy'],
    'B' : ['Big','X','Big','X','Very','X','X'],           
    'C' : ['','Cat','YY','Dog','Big','XY','YY'],
    'D' : ['','','Xy','Yy','','Cat','Yy'],
    'E' : ['','','Xy','XX','','','Xy'],           
    })

df = pd.DataFrame(data=d)

My expected output is:

    A     B    C    D   E
0   A   Big  Cat         
1  Yy     X              
2   A   Big  Dog   Xy  Xy
3  Xy     X        Yy  XX
4   A  Very  Big  Cat    
5  Zy     X   XY         
6  Yy     X   YY   Yy  Xy

I have tried:

df['C'] = df['C'].shift(-1)

But this shifts every value in the column up. I only want to select specific values (e.g. Cat, Dog) in certain columns and move them up one row.

I was thinking of listing the specified values and then shifting them up. Something like:

val = ['Cat','Dog']

# pseudocode: if a cell in df[['C', 'D']] is in val, shift that cell up one row

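Note: I can't sort this based on the surrounding strings. My actual df contains many different strings, and going through them all would take a long time.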

In this case, do the following:

# swap the values in rows 0 and 1 of column C (.loc avoids chained assignment)
df.loc[0, 'C'], df.loc[1, 'C'] = df.loc[1, 'C'], df.loc[0, 'C']
df['D'] = df['D'].shift(-1).fillna('X')
print(df)

Output:

     A    B       C      D  E
0    A  Big     Cat          
1    X    X                  
2    X    X       X      X  X
3    X    X       X      X  X
4  Foo  Bar  Foobar  Fubur   
5    X    X       X          
6    X    X       X      X  X

For a general solution, combine np.where() with pandas eq() to locate the value you want to move:

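import numpy as np

def shift_value(df, value):
    # positions of every cell equal to `value`; only the first match is used
    row, col = np.where(df.eq(value))
    old_row = row[0]
    old_col = col[0]
    new_row = old_row - 1   # one row up, same column
    new_col = old_col
    # modifies df in place: write the value one row up, then overwrite the old cell
    df.iat[new_row, new_col] = value
    df.iat[old_row, old_col] = "X"

for v in ["Cat", "Foobar"]:
    shift_value(df, v)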

df
     A    B       C      D  E
0    A  Big     Cat          
1    X    X       X          
2    X    X       X      X  X
3    X    X  Foobar      X  X
4  Foo  Bar       X          
5    X    X       X  Fubur   
6    X    X       X      X  X

Original OP data:

d = ({
    'A' : ['A','X','X','X','Foo','X','X'],
    'B' : ['Big','X','X','X','Bar','X','X'],           
    'C' : ['','Cat','X','X','Foobar','X','X'],
    'D' : ['','','X','X','','Fubur','X'],
    'E' : ['','','X','X','','','X'],           
    })

df = pd.DataFrame(data=d)
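The function above moves only the first match of a single value. As a rough sketch of how the same np.where() idea might be extended to every match of several values, something like the following could work. The name shift_values_up and the filler parameter are made up for this illustration, and matches in the first row are simply left where they are because there is no row above them.

```python
import numpy as np
import pandas as pd

def shift_values_up(df, values, filler="X"):
    """Return a copy of df where every cell found in `values` is moved up one row.

    Hypothetical helper for illustration; `filler` is written into the vacated cell.
    """
    out = df.copy()
    # np.where returns matches in row-major order, so rows are visited top to bottom
    rows, cols = np.where(out.isin(values).to_numpy())
    for r, c in zip(rows, cols):
        if r == 0:
            continue  # assumption: a match in the first row stays in place
        out.iat[r - 1, c] = out.iat[r, c]
        out.iat[r, c] = filler
    return out

# columns C and D of the data used in this answer
demo = pd.DataFrame({
    "C": ["", "Cat", "X", "X", "Foobar", "X", "X"],
    "D": ["", "", "X", "X", "", "Fubur", "X"],
})
shifted = shift_values_up(demo, ["Cat", "Foobar", "Fubur"])
print(shifted)
```

Because it works on a copy, the input frame is left untouched, which makes it easier to compare before and after.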

If what you need is to shift every value in rows that contain only a single meaningful word, this should work:

In [36]: import pandas as pd
    ...: d = ({
    ...:     'A' : ['A','X','X','X','Foo','X','X'],
    ...:     'B' : ['Big','X','X','X','Bar','X','X'],
    ...:     'C' : ['','Cat','X','X','Foobar','X','X'],
    ...:     'D' : ['','','X','X','','Fubur','X'],
    ...:     'E' : ['','','X','X','','','X'],
    ...:     })
    ...: df = pd.DataFrame(data=d)
    ...:
    ...: index = ((df!='X') & (df!='') & df.notna()).sum(axis=1) == 1
    ...: for row in df[index].index.values:
    ...:     for col in df.columns.values:
    ...:         if df.loc[row, col]!='X' and bool(df.loc[row, col]):
    ...:             df.loc[row-1, col] = df.loc[row, col]
    ...:             df.loc[row, col] = ''
    ...:

In [37]: df
Out[37]:
     A    B       C      D  E
0    A  Big     Cat
1    X    X
2    X    X       X      X  X
3    X    X       X      X  X
4  Foo  Bar  Foobar  Fubur
5    X    X       X
6    X    X       X      X  X

So, if the data is not too large, you can try a for loop:

for row in range(1, len(df)):
    for col in df.columns.values:
        if (df.loc[row, col] != '') and (df.loc[row-1, col] == ''):
            df.loc[row-1, col] = df.loc[row, col]
            df.loc[row, col] = '######'
df = df.replace('######', '')

I think you need df.combine_first:

mylist=['Cat','Dog']
a=df[df.isin(mylist)].shift(-1)
df[df.isin(mylist)]=""
out_df=a.combine_first(df)
print(out_df)
    A     B    C    D   E
0   A   Big  Cat         
1  Yy     X              
2   A   Big  Dog   Xy  Xy
3  Xy     X        Yy  XX
4   A  Very  Big  Cat    
5  Zy     X   XY         
6  Yy     X   YY   Yy  Xy
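Since the question asks for the shift only in certain columns, the combine_first idea could be wrapped in a small function that takes the column list as an argument. This is only a sketch: shift_matches_up and its parameters are names invented here, and it assumes no match sits in the first row (shift(-1) would drop it).

```python
import pandas as pd

def shift_matches_up(df, values, columns):
    """Return a copy of df with cells in `values` moved up one row, only in `columns`.

    Hypothetical helper for illustration, not part of pandas.
    """
    sub = df[columns]
    hits = sub.isin(values)
    moved = sub.where(hits).shift(-1)   # matched cells one row higher, NaN elsewhere
    cleared = sub.mask(hits, "")        # original cells with the matches blanked out
    out = df.copy()
    out[columns] = moved.combine_first(cleared)  # moved values take priority
    return out

# the example data from the question
df = pd.DataFrame({
    "A": ["A", "Yy", "A", "Xy", "A", "Zy", "Yy"],
    "B": ["Big", "X", "Big", "X", "Very", "X", "X"],
    "C": ["", "Cat", "YY", "Dog", "Big", "XY", "YY"],
    "D": ["", "", "Xy", "Yy", "", "Cat", "Yy"],
    "E": ["", "", "Xy", "XX", "", "", "Xy"],
})
result = shift_matches_up(df, ["Cat", "Dog"], ["C", "D"])
print(result)
```

Columns outside the list are never touched, so a Cat or Dog appearing in, say, column A would stay where it is.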

