Python Pandas: DataFrame filter negative values

Question

I was wondering how I can remove all rows (index entries) that contain a negative value in any of their columns. I am using pandas DataFrames.

Documentation Pandas DataFrame

Format:

Myid - valuecol1 - valuecol2 - valuecol3 -... valuecol30

So my DataFrame is called data

I know how to do this for 1 column:

data2 = data.index[data['valuecol1'] > 0]
data3 = data.loc[data2]

This gives me only the ids where valuecol1 > 0. How can I combine the conditions with some kind of and statement?

valuecol1 && valuecol2 && valuecol3 && ... && valuecol30 > 0 ?

Answer 1

You can use all to check whether every value in a row or column is True:

In [11]: df = pd.DataFrame(np.random.randn(10, 3))

In [12]: df
Out[12]:
          0         1         2
0 -1.003735  0.792479  0.787538
1 -2.056750 -1.508980  0.676378
2  1.355528  0.307063  0.369505
3  1.201093  0.994041 -1.169323
4 -0.305359  0.044360 -0.085346
5 -0.684149 -0.482129 -0.598155
6  1.795011  1.231198 -0.465683
7 -0.632216 -0.075575  0.812735
8 -0.479523 -1.900072 -0.966430
9 -1.441645 -1.189408  1.338681

In [13]: (df > 0).all(1)
Out[13]:
0    False
1    False
2     True
3    False
4    False
5    False
6    False
7    False
8    False
9    False
dtype: bool

In [14]: df[(df > 0).all(1)]
Out[14]:
          0         1         2
2  1.355528  0.307063  0.369505

If you only want to look at a subset of the columns, e.g. [0, 1]:

In [15]: df[(df[[0, 1]] > 0).all(1)]
Out[15]:
          0         1         2
2  1.355528  0.307063  0.369505
3  1.201093  0.994041 -1.169323
6  1.795011  1.231198 -0.465683
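Applied to the layout described in the question, the same idea might look like the sketch below. The sample values are made up, only three value columns are used instead of thirty, and Myid is assumed to be an ordinary column rather than the index.

```python
import pandas as pd

# Hypothetical sample data in the question's layout (values are invented).
data = pd.DataFrame({
    "Myid": [101, 102, 103, 104],
    "valuecol1": [1.5, -0.2, 3.0, 0.7],
    "valuecol2": [2.0, 1.1, -4.0, 0.3],
    "valuecol3": [0.5, 0.9, 1.2, 0.1],
})

# Every column except the id column.
value_cols = data.columns.drop("Myid")

# True for rows in which every value column is strictly positive.
all_positive = (data[value_cols] > 0).all(axis=1)

# Boolean indexing keeps only the rows where the mask is True.
filtered = data[all_positive]
print(filtered)
```

Note that > 0 also drops rows containing an exact zero; if zeros should be kept, >= 0 would be the comparison to use.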

Answer 2

You could loop over the column names, skipping the first (id) column and filtering on one column at a time:

for col in data.columns.tolist()[1:]:
    data = data.loc[data[col] > 0]
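As a rough check, the loop can be compared against a single vectorized mask. The sketch below uses invented data and .loc, since .ix was removed in pandas 1.0; both approaches should select the same rows, but the loop builds an intermediate frame on every pass.

```python
import pandas as pd

# Hypothetical data: an id column followed by two value columns.
data = pd.DataFrame({
    "Myid": [1, 2, 3],
    "valuecol1": [0.4, -1.0, 2.2],
    "valuecol2": [1.3, 0.8, -0.5],
})

# Loop version: narrow the frame one column at a time.
looped = data.copy()
for col in looped.columns[1:]:
    looped = looped.loc[looped[col] > 0]

# Vectorized version: one mask over all value columns at once.
vectorized = data[(data.iloc[:, 1:] > 0).all(axis=1)]

print(looped.equals(vectorized))
```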

Answer 3

To combine conditions on a DataFrame, use a single & character and wrap each condition in parentheses.

For example:

data = data[(data['valuecol1'] > 0) & (data['valuecol2'] > 0) & (data['valuecol3'] > 0)]
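Writing thirty conditions by hand gets tedious, so one option is to build the conditions in a list and fold them together with &. The sketch below is only an illustration with invented data and three columns; functools.reduce and operator.and_ come from the standard library.

```python
from functools import reduce
import operator

import pandas as pd

# Hypothetical data with three value columns.
data = pd.DataFrame({
    "valuecol1": [1, -1, 2, 3],
    "valuecol2": [1, 1, -2, 3],
    "valuecol3": [1, 1, 2, 3],
})

cols = ["valuecol1", "valuecol2", "valuecol3"]

# One boolean Series per column.
conditions = [data[c] > 0 for c in cols]

# Equivalent to cond1 & cond2 & cond3, applied element-wise.
mask = reduce(operator.and_, conditions)

result = data[mask]
print(result)
```

The plain Python keyword and does not work here, because it asks for the truth value of a whole Series, which pandas refuses with a ValueError; & is the element-wise operator.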

Answer 4

If you want to check the values of a contiguous group of columns, for example from the second to the tenth (positions 1 to 9, since iloc is zero-based and excludes the end of the slice):

df[(df.iloc[:, 1:10] > 0).all(axis=1)]

You can also use a range:

df[(df.iloc[:, list(range(1, 10, 3))] > 0).all(axis=1)]

or your own list of column positions:

mylist = [1, 2, 4, 8]
df[(df.iloc[:, mylist] > 0).all(axis=1)]
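With named columns like those in the question, the same selection can be made by position, by label, or by a shared name fragment. The sketch below uses invented data; note that .iloc slices exclude the end position, while .loc label slices include both endpoints.

```python
import pandas as pd

# Hypothetical data in the question's layout.
data = pd.DataFrame({
    "Myid": [1, 2, 3],
    "valuecol1": [1.0, 2.0, -3.0],
    "valuecol2": [4.0, -5.0, 6.0],
    "valuecol3": [-7.0, 8.0, 9.0],
})

# Positions 1 and 2 (end position 3 is excluded).
by_position = data.iloc[:, 1:3]

# Labels valuecol1 through valuecol2 (both endpoints included).
by_label = data.loc[:, "valuecol1":"valuecol2"]

# Every column whose name contains "valuecol".
by_prefix = data.filter(like="valuecol")

# Keep rows where the selected columns are all strictly positive.
result = data[(by_label > 0).all(axis=1)]
print(result)
```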

4 answers

solution1 19 2014-06-14 00:28:26
solution2 6 ACCEPTED 2014-06-13 23:20:03
solution3 2 2017-08-09 17:58:30
solution4 1 2017-04-25 21:38:55
