Create a pandas dataframe column based on a logical test

I have a dataframe containing the following columns:

amount1 - a numeric value
amount2 - a different numeric value
ccy1 - a 3-char currency code
ccy2 - a different 3-char currency code

The data is organised such that for some rows the tuple (amount1, ccy1, amount2, ccy2) corresponds exactly to another row holding the tuple (amount2, ccy2, amount1, ccy1).

What I want to do is split my dataframe into two. In df1 I want to include those rows where ccy1 >= ccy2 (compared alphabetically), and in df2 I want to include those rows where ccy1 < ccy2.

I wrote a simple function that does the splitting:

def splitfunctest(s1, s2):
   if s1 > s2:
      return 'BIG'
   else:
      return 'SMALL'

But I am having trouble applying it to create a new column. I am trying:

df['splitter'] = splitfunctest(df['ccy1'], df['ccy2'])

but get:

Traceback (most recent call last):
  File "", line 1, in
  File "", line 2, in splitfunctest
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

So I can see that the function is trying (and failing) to evaluate the entire contents of each column passed to it. How do I get it to operate element-wise, one row at a time? Any help would be greatly appreciated.

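Comparing the two columns directly gives a boolean Series, one value per row, which you can use as a mask to select rows. Try this: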

df1 = df[df['ccy1'] >= df['ccy2']]
df2 = df[df['ccy1'] < df['ccy2']]
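If you also want the 'splitter' column itself, one possible approach is numpy.where, which picks a label per row from a boolean mask. The following is only a sketch: the sample data is made up for illustration, and it uses >= (as in the stated goal) rather than the > in splitfunctest.

```python
import numpy as np
import pandas as pd

# Made-up sample data: each row has a mirrored counterpart.
df = pd.DataFrame({
    "amount1": [100.0, 85.0, 200.0, 150.0],
    "ccy1": ["USD", "EUR", "GBP", "JPY"],
    "amount2": [85.0, 100.0, 150.0, 200.0],
    "ccy2": ["EUR", "USD", "JPY", "GBP"],
})

# Element-wise string comparison: yields one True/False per row.
mask = df["ccy1"] >= df["ccy2"]

# Label each row from the mask instead of calling a scalar function
# on whole columns (which is what raised the ValueError).
df["splitter"] = np.where(mask, "BIG", "SMALL")

# The same mask splits the frame; ~mask is the element-wise negation.
df1 = df[mask]
df2 = df[~mask]
```

Reusing a single mask keeps the two halves complementary, so every row lands in exactly one of df1 or df2.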
