简体   繁体   中英

Pandas pivoted dataframe and multi-column Boolean comparison

I have a pivoted dataframe of the form

      Price             Units  
Buyer     B     G     S     B   G   S
Idx                                  
1         0  1.51     0     0  11   0
2      2.32  1.32     0    21  13   0
3         0     0  1.44     0   0  14

I am trying to do create another major column called "Flag" with B, G, S sub-columns using the logic that can be thought of as (cell-by-cell)

p['Flag'] = (p['Price'] < 2.0) & (p['Units'] > 13.5)

So the desired result (showing only the new columns)

       Flag
Buyer     B     G     S     
Idx                                  
1     False False False
2     False False False
3     False False  True

I have tried quite a few ways and the following comes closer than others

newp = p.join(((p['Price'] < 2.0) & (p['Units'] > 13.5)).rename(columns=dict(Price='Flag')))

but this has two issues

  1. The boolean output is incorrect for the bottom right corner. It should be true since the corresponding cell price is less than 2.0 and the corresponding cell units is more than 13.5.
  2. It gives the warning "UserWarning: merging between different levels can give an unintended result (2 levels on the left, 1 on the right)". I can't seem to get the major column name "Flag" into the dataframe.

Any ideas on fixing the Boolean conditions and merging at the correct level?

The code for generating the initial dataframe is

from collections import OrderedDict
import pandas as pd

table = OrderedDict((
    ("Idx", [1, 2, 2, 3]),
    ('Buyer',['G', 'B', 'G', 'S']),
    ('Price',  ['1.51', '2.32', '1.32', '1.44']),
    ('Units',   ['11', '21', '13', '14'])
))
d = pd.DataFrame(table)
p = d.pivot(index='Idx', columns='Buyer')
p.fillna(0, inplace=True)

I think you need convert string numbers to float by astype and then use concat :

p = p.astype(float)

newp = pd.concat([p['Price'], p['Units'], (p['Price'] < 2.0) & (p['Units'] > 13.5)], 
                 axis=1, 
                 keys=['Price','Units','Flag'])
print (newp)

      Price             Units               Flag              
Buyer     B     G     S     B     G     S      B      G      S
Idx                                                           
1      0.00  1.51  0.00   0.0  11.0   0.0  False  False  False
2      2.32  1.32  0.00  21.0  13.0   0.0  False  False  False
3      0.00  0.00  1.44   0.0   0.0  14.0  False  False   True

Solution with join and MultiIndex.from_product for create new level :

p = p.astype(float)

a = (p['Price'] < 2.0) & (p['Units'] > 13.5)
a.columns = pd.MultiIndex.from_product([['Flag'],a.columns])
p = p.join(a)
print (p)
      Price             Units               Flag              
Buyer     B     G     S     B     G     S      B      G      S
Idx                                                           
1      0.00  1.51  0.00   0.0  11.0   0.0  False  False  False
2      2.32  1.32  0.00  21.0  13.0   0.0  False  False  False
3      0.00  0.00  1.44   0.0   0.0  14.0  False  False   True

Use double brackets on 'Price' to preserve the multi-index and logically combine with 'Units' after having removed the first level of the multi-index. This way, the level that is left naturally combines with the 2nd level of the multi-index from 'Price'

Enough talk. Observe:

p[['Price']].lt(2) & p.Units.gt(13.5)

       Price              
Buyer      B      G      S
Idx                       
1      False  False  False
2      False  False  False
3      False  False   True

Now all that's left is to rename 'Price' and join

p.join(
    (
        p[['Price']].lt(2) & p.Units.gt(13.5)
    ).rename(columns=dict(Price='Flag'))
)

      Price             Units               Flag              
Buyer     B     G     S     B     G     S      B      G      S
Idx                                                           
1      0.00  1.51  0.00   0.0  11.0   0.0  False  False  False
2      2.32  1.32  0.00  21.0  13.0   0.0  False  False  False
3      0.00  0.00  1.44   0.0   0.0  14.0  False  False   True

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM