
Remove linearly increasing "count" columns in pandas

I have a dataframe in which some columns are counters that increase by one at every timestep. I would like to drop these automatically, in the spirit of df.dropna(), with something like df.dropcounts().

Here is an example dataframe:

import pandas as pd
array = [[0.0,1.6,2.7,12.0],[1.0,3.5,4.5,13.0],[2.0,6.5,8.6,14.0]]
df = pd.DataFrame(array)

     0    1    2     3
0  0.0  1.6  2.7  12.0
1  1.0  3.5  4.5  13.0
2  2.0  6.5  8.6  14.0

I would like to drop the first and last columns.

I believe you need:

val = 1
df = df.loc[:, df.diff().fillna(val).ne(val).any()]
print (df)
     1    2
0  1.6  2.7
1  3.5  4.5
2  6.5  8.6
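
pandas has no built-in df.dropcounts(), so the one-liner above could be wrapped in a small helper. The sketch below is one possible way to do that; the name drop_count_columns and its step parameter are hypothetical, not part of pandas. Like the one-liner, it treats a one-row dataframe as all counters, because the only diff row is NaN and gets filled with step.

```python
import pandas as pd


def drop_count_columns(df: pd.DataFrame, step: float = 1) -> pd.DataFrame:
    """Return df without the columns that increase by exactly `step` per row.

    Hypothetical helper: same logic as the one-liner above, with `val`
    exposed as the `step` argument.
    """
    # True for every column that has at least one row-to-row difference != step
    keep = df.diff().fillna(step).ne(step).any()
    return df.loc[:, keep]


df = pd.DataFrame([[0.0, 1.6, 2.7, 12.0],
                   [1.0, 3.5, 4.5, 13.0],
                   [2.0, 6.5, 8.6, 14.0]])
result = drop_count_columns(df)
print(result)
```

The helper returns a new dataframe and leaves the input unchanged, which mirrors how df.dropna() behaves by default.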

Explanation:

First, compute the row-to-row differences with DataFrame.diff:

print (df.diff())
     0    1    2    3
0  NaN  NaN  NaN  NaN
1  1.0  1.9  1.8  1.0
2  1.0  3.0  4.1  1.0

Replace the NaNs in the first row with val, so that they are not flagged as differences:

print (df.diff().fillna(val))
     0    1    2    3
0  1.0  1.0  1.0  1.0
1  1.0  1.9  1.8  1.0
2  1.0  3.0  4.1  1.0

Test which values are not equal to val with DataFrame.ne:

print (df.diff().fillna(val).ne(val))
       0      1      2      3
0  False  False  False  False
1  False   True   True  False
2  False   True   True  False

And check for at least one True per column with DataFrame.any:

print (df.diff().fillna(val).ne(val).any())
0    False
1     True
2     True
3    False
dtype: bool
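
The steps above compare the differences to val with exact equality. That works for the integer-valued example, but it may miss a counter with a fractional step, because floating-point differences are not always exactly equal to the step. A minimal sketch of a tolerant variant, assuming numpy.isclose with its default tolerances is acceptable for your data; the column names t and x and the 0.1 step are made up for illustration:

```python
import numpy as np
import pandas as pd

# Illustrative data: "t" is a counter with step 0.1, "x" is a real signal
df = pd.DataFrame({"t": [0.0, 0.1, 0.2, 0.3],
                   "x": [1.6, 3.5, 6.5, 7.0]})
step = 0.1

# Skip the first row of the diff, which is all NaN
diffs = df.diff().iloc[1:].to_numpy()

# A column is a counter if every difference is close to the step
is_counter = np.isclose(diffs, step).all(axis=0)

filtered = df.loc[:, ~is_counter]
print(filtered)
```

Here the first row of the diff is sliced off instead of being filled, so a NaN inside the data does not count as a match for the step.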

Using all (d is the original dataframe):

d.loc[:,~d.diff().fillna(1).eq(1).all().values]
Out[295]: 
     1    2
0  1.6  2.7
1  3.5  4.5
2  6.5  8.6
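
The any/ne form and the all/eq form should select the same columns, since "at least one difference is not 1" is the negation of "all differences are 1". A short check on the example dataframe, as a sketch rather than a proof:

```python
import pandas as pd

df = pd.DataFrame([[0.0, 1.6, 2.7, 12.0],
                   [1.0, 3.5, 4.5, 13.0],
                   [2.0, 6.5, 8.6, 14.0]])

# Mask from the first answer: keep columns with any difference != 1
keep_any = df.diff().fillna(1).ne(1).any()

# Mask from the second answer: keep columns where not all differences == 1
keep_all = ~df.diff().fillna(1).eq(1).all()

print(keep_any.equals(keep_all))
```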

