Pandas: remove specific int values from comma-separated column values

Question

I have a dataframe whose column values are comma-separated strings. I want to remove certain values from those strings.

My dataframe looks like this:

  col1             col2  
0,1,0,2,30,10,20  0,0,2,3,10,20
0,0,0,1,0,210,30  0,0,20,20,20,0,0,0

I want to remove the values 0, 1 and 2 from each column.

The output should be:

 col1             col2                new_col1  new_col2
0,1,0,2,30,10,20  0,0,2,3,10,20       30,10,20   3,10,20
0,0,0,1,0,210,30  0,0,20,20,20,0,0,0   210,30    20,20,20

I tried:

def mysub(r):

     lst = [float(a) for a in r.split(',') if a != '0' and a != '' and  a != "1" and  a != "2"]
     return lst
df['new_col1']=df[df['col1']].mysub()

I am not able to fix my problem. Please help me sort it out.

Answer 1

Use a list comprehension that keeps only the tokens not in the list of values to remove:

def mysub(r):
    return [','.join(z for z in str(y).split(',') 
            if z not in ['0','1','2']) for y in r]
df = df.apply(mysub)
print (df)
       col1      col2
0  30,10,20   3,10,20
1    210,30  20,20,20

To keep the original columns and add the results as new columns:

def mysub(r):
    return [','.join(z for z in str(y).split(',') 
            if z not in ['0','1','2']) for y in r]
df = df.join(df.apply(mysub).add_prefix('new_'))
print (df)
               col1                col2  new_col1  new_col2
0  0,1,0,2,30,10,20       0,0,2,3,10,20  30,10,20   3,10,20
1  0,0,0,1,0,210,30  0,0,20,20,20,0,0,0    210,30  20,20,20
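
The question's attempt, df[df['col1']].mysub(), most likely fails because indexing the dataframe with a Series of strings looks those strings up as column labels, and a DataFrame has no mysub method in any case. If only one new column is wanted, a minimal sketch is to pass the function to Series.apply instead. The sample data is copied from the question; the drop parameter is a name introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "col1": ["0,1,0,2,30,10,20", "0,0,0,1,0,210,30"],
    "col2": ["0,0,2,3,10,20", "0,0,20,20,20,0,0,0"],
})

def mysub(cell, drop=("0", "1", "2")):
    # Split one cell on commas and keep the tokens that are not in `drop`
    # (`drop` is a hypothetical parameter, not part of the original answer)
    return ",".join(tok for tok in str(cell).split(",") if tok not in drop)

# Series.apply calls mysub once per cell of col1
df["new_col1"] = df["col1"].apply(mysub)
print(df[["col1", "new_col1"]])
```

Here the function receives a single string rather than a whole column, so it needs no inner loop over rows.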

If you want lists of floats as output:

def mysub(r):
    return [[float(z) for z in str(y).split(',') 
            if z not in ['0','1','2']] for y in r]
df = df.join(df.apply(mysub).add_prefix('new_'))
print (df)
               col1                col2            new_col1  \
0  0,1,0,2,30,10,20       0,0,2,3,10,20  [30.0, 10.0, 20.0]   
1  0,0,0,1,0,210,30  0,0,20,20,20,0,0,0       [210.0, 30.0]   

             new_col2  
0   [3.0, 10.0, 20.0]  
1  [20.0, 20.0, 20.0]
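
The question's own attempt also skipped empty tokens (a != ''), which matters for the float version because float('') raises a ValueError. The following is a sketch of one way to handle that, assuming empty tokens and stray spaces can occur in the data. The second row and the names DROP and to_floats are made up for illustration and do not come from the original post.

```python
import pandas as pd

# First row is from the question; the second row is a hypothetical
# example containing an empty token and a stray space
df = pd.DataFrame({"col1": ["0,1,0,2,30,10,20", "0,,1, 210,30"]})

DROP = {"0", "1", "2"}  # values to remove, compared as strings

def to_floats(cell):
    # Strip whitespace around each token, then skip empty tokens
    # and tokens listed in DROP before converting to float
    tokens = (tok.strip() for tok in str(cell).split(","))
    return [float(tok) for tok in tokens if tok and tok not in DROP]

df["new_col1"] = df["col1"].apply(to_floats)
print(df)
```

Because the comparison is done on strings, a token such as "0.0" or "01" would not be removed; converting to a number before comparing would be needed if the data contains such variants.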

Answer 2

df.applymap(lambda x: ','.join([e for e in x.split(',') if e not in ['0','1','2']]))
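
A self-contained version of the one-liner above might look like the sketch below. DataFrame.applymap has been deprecated since pandas 2.1 in favour of DataFrame.map, so the sketch picks whichever method is available. The names REMOVE, strip_values and method_name are chosen here for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "col1": ["0,1,0,2,30,10,20", "0,0,0,1,0,210,30"],
    "col2": ["0,0,2,3,10,20", "0,0,20,20,20,0,0,0"],
})

REMOVE = {"0", "1", "2"}  # tokens to drop, compared as strings

def strip_values(cell):
    # Keep only the comma-separated tokens that are not in REMOVE
    return ",".join(e for e in cell.split(",") if e not in REMOVE)

# Use DataFrame.map where it exists (pandas 2.1+), else fall back to applymap
method_name = "map" if hasattr(pd.DataFrame, "map") else "applymap"
result = getattr(df, method_name)(strip_values)
print(result)
```

Both methods call the function once per cell, so this assumes every cell is a string; a missing value (NaN) would raise an AttributeError on .split.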
