简体   繁体   中英

Pandas filter dataframe for positive and negative values

I have a pandas dataframe with 3 columns where:

  • Category dtype - string
  • Date dtype - datetime
  • Values dtype - float

     df = pd.DataFrame() df['category'] = ['a', 'b', 'b', 'b', 'c', 'a', 'b', 'c', 'c', 'a'] df['date'] = ['2018-01-01', '2018-01-01', '2018-01-03', '2018-01-05', '2018-01-01', '2018-01-02', '2018-01-06', '2018-01-03', '2018-01-04','2018-01-01'] df['values'] = [1, 2, -1.5, 2.3, 5, -0.7, -5.2, -5.2, 1, -1.1] df 

Dataframe view

I want to filter for rows that have a positive value and a negative value (with the least difference) close to that date per category.

So, essentially an output that looks like:

df = pd.DataFrame()
df['category'] = ['a', 'a','b', 'b', 'c', 'c']
df['date'] = ['2018-01-01', '2018-01-01', '2018-01-01', '2018-01-03', '2018-01-01', '2018-01-03']
df['values'] = [1, -1.1, 2, -1.5, 5, -5.2]
df

Filtered Dataframe

I have looked at similar queries on SO ( Identifying closest value in a column for each filter using Pandas , How do I find the closest values in a Pandas series to an input number? )

The first one utilises idxmin , which returns first occurence, not the closest in value.

The second link is speaking about a specific value as an input - I don't think a pure np.argsort works in my case.

I can imagine using a complex web of if statements to do this, but, I'm not sure what the most efficient way of doing this is with pandas.

Any guidance would be greatly appreciated.

IIUC, sort your dataframe first then use idxmin :

df1 = df.sort_values(['category','date'])
df1[df1.groupby('category')['values']\
       .transform(lambda x: x.index.isin([x.ge(0).idxmin(), x.lt(0).idxmin()]))]

Output:

  category        date  values
0        a  2018-01-01     1.0
9        a  2018-01-01    -1.1
1        b  2018-01-01     2.0
2        b  2018-01-03    -1.5
4        c  2018-01-01     5.0
7        c  2018-01-03    -5.2

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM