
How to swap two columns and flip a third in a pandas DataFrame?

I'm conducting an experiment (using Python 2.7, pandas 0.23.4) in which I have three levels of a stimulus {a, b, c}. I present all the different combinations to participants, who have to choose which one was rougher. (Example: Stimulus 1 = a, Stimulus 2 = b; the participant chooses 1, indicating that stimulus 1 was rougher.)

After the experiment, I have a data frame with three columns like this:

import pandas as pd

d = {'Stim1':  ['a', 'b', 'a', 'c', 'b', 'c'],
     'Stim2': ['b', 'a', 'c', 'a', 'c', 'b'],
     'Answer': [1, 2, 2, 1, 2, 1]}
df = pd.DataFrame(d)

        Stim1  Stim2  Answer
    0     a     b      1
    1     b     a      2
    2     a     c      2
    3     c     a      1
    4     b     c      2
    5     c     b      1 

For my analysis, the order in which the stimuli were presented doesn't matter: Stim1 = a, Stim2 = b is the same as Stim1 = b, Stim2 = a. I'm trying to figure out how to swap Stim1 and Stim2 and flip the Answer so that the data looks like this:

        Stim1  Stim2  Answer
    0     a     b      1
    1     a     b      1
    2     a     c      2
    3     a     c      2
    4     b     c      2
    5     b     c      2

I read that np.where can be used, but it would do one thing at a time, whereas I want to do two (swap and flip).

Is there some way to use another function to do the swap and the flip at the same time?

Can you check whether this works for you?

import pandas as pd
import numpy as np

df = pd.DataFrame(d)

# keep a copy of the original Stim1 column
s = df['Stim1'].copy()

# sort the two stimulus values within each row (np.sort sorts along the last axis by default)
df[['Stim1', 'Stim2']] = np.sort(df[['Stim1', 'Stim2']].values)

# exchange the Answer if the order has changed
df['Answer'] = df['Answer'].where(df['Stim1'] == s, df['Answer'].replace({1:2,2:1}))

Output:

  Stim1 Stim2  Answer
0     a     b       1
1     a     b       1
2     a     c       2
3     a     c       2
4     b     c       2
5     b     c       2
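For convenience, the approach above can be wrapped in a small self-contained function. This is only a sketch: the name normalize_pairs is hypothetical, and it uses 3 - Answer in place of replace({1: 2, 2: 1}), which assumes the answers are always 1 or 2.

```python
import numpy as np
import pandas as pd


def normalize_pairs(df):
    """Return a copy with Stim1 <= Stim2 in every row and Answer flipped where swapped.

    Hypothetical helper; assumes Answer only ever contains 1 or 2.
    """
    out = df.copy()
    original_first = out['Stim1'].copy()
    # sort the two stimulus labels within each row
    out[['Stim1', 'Stim2']] = np.sort(out[['Stim1', 'Stim2']].values, axis=1)
    # a row was reordered if its first stimulus is no longer the original one
    swapped = out['Stim1'] != original_first
    # 3 - Answer maps 1 -> 2 and 2 -> 1; keep the old answer where nothing moved
    out['Answer'] = out['Answer'].where(~swapped, 3 - out['Answer'])
    return out


d = {'Stim1': ['a', 'b', 'a', 'c', 'b', 'c'],
     'Stim2': ['b', 'a', 'c', 'a', 'c', 'b'],
     'Answer': [1, 2, 2, 1, 2, 1]}
result = normalize_pairs(pd.DataFrame(d))
print(result)
```

Working on a copy leaves the original frame untouched, so the raw presentation order is still available if it is needed later.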

You can start by building a boolean Series that indicates which rows should be swapped:

>>> swap = df['Stim1'] > df['Stim2']
>>> swap
0    False
1     True
2    False
3     True
4    False
5     True
dtype: bool

Then build the fully swapped DataFrame as follows:

>>> swapped_df = pd.concat([
...     df['Stim1'].rename('Stim2'),
...     df['Stim2'].rename('Stim1'),
...     3 - df['Answer'],
... ], axis='columns')
>>> swapped_df
  Stim2 Stim1  Answer
0     a     b       2
1     b     a       1
2     a     c       1
3     c     a       2
4     b     c       1
5     c     b       2

Finally, use .mask() to select between the initial rows and the swapped rows:

>>> df.mask(swap, swapped_df)
  Stim1 Stim2  Answer
0     a     b       1
1     a     b       1
2     a     c       2
3     a     c       2
4     b     c       2
5     b     c       2

NB: .mask is roughly the same as .where, but it replaces the rows where the series is True instead of keeping them. The following gives the same result, apart from the column order:

>>> swapped_df.where(swap, df)
  Stim2 Stim1  Answer
0     b     a       1
1     b     a       1
2     c     a       2
3     c     a       2
4     c     b       2
5     c     b       2
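Since the question mentions np.where, here is a sketch of how the same boolean mask could drive np.where for all three columns at once. It is an illustration rather than part of the answers above; the variable name normalized is made up, and 3 - Answer again assumes the answers are only 1 or 2.

```python
import numpy as np
import pandas as pd

d = {'Stim1': ['a', 'b', 'a', 'c', 'b', 'c'],
     'Stim2': ['b', 'a', 'c', 'a', 'c', 'b'],
     'Answer': [1, 2, 2, 1, 2, 1]}
df = pd.DataFrame(d)

# True for rows whose stimuli are in the "wrong" (descending) order
swap = (df['Stim1'] > df['Stim2']).values

# Every column is computed from the original df before anything is
# overwritten, so no temporary copy of Stim1 is needed.
normalized = pd.DataFrame({
    'Stim1': np.where(swap, df['Stim2'], df['Stim1']),
    'Stim2': np.where(swap, df['Stim1'], df['Stim2']),
    'Answer': np.where(swap, 3 - df['Answer'], df['Answer']),
}, index=df.index, columns=['Stim1', 'Stim2', 'Answer'])
print(normalized)
```

Each np.where call still handles a single column, but because all three share one mask and read from the untouched original frame, the swap and the flip stay consistent.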
