I have a pandas DataFrame with 3 columns, where the values column has dtype float:
df = pd.DataFrame()
df['category'] = ['a', 'b', 'b', 'b', 'c', 'a', 'b', 'c', 'c', 'a']
df['date'] = ['2018-01-01', '2018-01-01', '2018-01-03', '2018-01-05', '2018-01-01', '2018-01-02', '2018-01-06', '2018-01-03', '2018-01-04', '2018-01-01']
df['values'] = [1, 2, -1.5, 2.3, 5, -0.7, -5.2, -5.2, 1, -1.1]
df
Per category, I want to filter for a pair of rows - one with a positive value and one with a negative value - whose dates are closest together (least date difference).
So, essentially an output that looks like:
df = pd.DataFrame()
df['category'] = ['a', 'a','b', 'b', 'c', 'c']
df['date'] = ['2018-01-01', '2018-01-01', '2018-01-01', '2018-01-03', '2018-01-01', '2018-01-03']
df['values'] = [1, -1.1, 2, -1.5, 5, -5.2]
df
I have looked at similar questions on SO (Identifying closest value in a column for each filter using Pandas, How do I find the closest values in a Pandas series to an input number?). The first one uses idxmin, which returns the first occurrence, not the closest in value. The second one is about matching a specific input value - I don't think a pure np.argsort works in my case.
I can imagine using a complex web of if statements to do this, but I'm not sure what the most efficient way to do it with pandas is.
Any guidance would be greatly appreciated.
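For reference, the brute-force loop I have in mind looks something like this (a sketch, assuming the selection I want is "earliest positive and earliest negative value per category, in date order"):

```python
import pandas as pd

df = pd.DataFrame({
    'category': ['a', 'b', 'b', 'b', 'c', 'a', 'b', 'c', 'c', 'a'],
    'date': ['2018-01-01', '2018-01-01', '2018-01-03', '2018-01-05',
             '2018-01-01', '2018-01-02', '2018-01-06', '2018-01-03',
             '2018-01-04', '2018-01-01'],
    'values': [1, 2, -1.5, 2.3, 5, -0.7, -5.2, -5.2, 1, -1.1],
})

rows = []
# Sort by date so iloc[0] below is the earliest row of each sign
for cat, grp in df.sort_values('date').groupby('category'):
    pos = grp[grp['values'] >= 0]
    neg = grp[grp['values'] < 0]
    if not pos.empty and not neg.empty:
        rows.append(pos.iloc[0])  # earliest positive value in this category
        rows.append(neg.iloc[0])  # earliest negative value in this category
result = pd.DataFrame(rows)
print(result)
```

This produces the desired output, but it feels clumsy compared to a vectorised pandas solution.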
IIUC, sort your dataframe first, then use idxmin:
df1 = df.sort_values(['category', 'date'])
df1[df1.groupby('category')['values']
       .transform(lambda x: x.index.isin([x.ge(0).idxmin(), x.lt(0).idxmin()]))]
Output:
category date values
0 a 2018-01-01 1.0
9 a 2018-01-01 -1.1
1 b 2018-01-01 2.0
2 b 2018-01-03 -1.5
4 c 2018-01-01 5.0
7 c 2018-01-03 -5.2
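Why this works: idxmin on a boolean Series returns the index label of the first False (since False < True), so after sorting by date, x.ge(0).idxmin() is the first negative row per category and x.lt(0).idxmin() is the first positive one. A minimal sketch using the sorted values of category 'a' from the frame above:

```python
import pandas as pd

# Category 'a' after sorting by date: indices 0, 9, 5 of the original frame
s = pd.Series([1.0, -1.1, -0.7], index=[0, 9, 5])

# idxmin on a boolean Series gives the label of the first False
first_negative = s.ge(0).idxmin()  # first row where value < 0
first_positive = s.lt(0).idxmin()  # first row where value >= 0
print(first_negative, first_positive)  # → 9 0
```

Note this picks the first positive/negative row in date order per category, which matches your expected output.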