
Multiply two numerical columns on conditional using Pandas

I have a pandas DataFrame (data) with three columns: X, Y and Z.

I need to compute the following:

X * Y where Z == 'value'

I'm working along the lines of:

data[data['Z'] == 'value',[data['X']*data['Y']]]

Now, I know that this isn't correct, but I can smell the correct answer. Can someone point me in the right direction?

IIUC:

(df.X * df.Y).where(df.Z == 'Value')

or

df[df.Z == 'Value'].eval('X * Y')

Examples:

import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame({'X':np.arange(10),'Y':np.arange(10),'Z':np.random.choice(['Value',np.nan],10)})

(df.X * df.Y).where(df.Z == 'Value')

0     0.0
1     NaN
2     4.0
3     9.0
4    16.0
5    25.0
6    36.0
7     NaN
8     NaN
9    81.0
dtype: float64

Or

df[df.Z == 'Value'].eval('X * Y')

0     0
2     4
3     9
4    16
5    25
6    36
9    81
dtype: int32
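
The two forms above differ in what they return, which a small deterministic frame makes easier to see. This is only a sketch: the four-row frame and the names `mask`, `masked` and `filtered` are made up for illustration and are not part of the answer.

```python
import pandas as pd

# Hypothetical toy data, chosen so the result is easy to check by hand
df = pd.DataFrame({
    "X": [1, 2, 3, 4],
    "Y": [10, 20, 30, 40],
    "Z": ["Value", "other", "Value", "other"],
})
mask = df["Z"] == "Value"

# .where keeps every row and puts NaN where the condition is False,
# so the integer products are upcast to float64
masked = (df["X"] * df["Y"]).where(mask)

# Boolean filtering drops the non-matching rows before multiplying,
# so the result stays integer and keeps only the matching index labels
filtered = df[mask].eval("X * Y")

print(masked.tolist())          # [10.0, nan, 90.0, nan]
print(filtered.index.tolist())  # [0, 2]
print(filtered.tolist())        # [10, 90]
```

Use the first form when the result has to line up row for row with the original frame, and the second when only the matching rows are needed.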

Setup
Borrowed from @ScottBoston

import numpy as np
import pandas as pd

np.random.seed(123)
df = pd.DataFrame({
    'X':np.arange(10),
    'Y':np.arange(10),
    'Z':np.random.choice(['Value',np.nan],10)
})

Solution

df.loc[df.Z.eq('Value'), ['X', 'Y']].prod(1) 

0     0
2     4
3     9
4    16
5    25
6    36
9    81
dtype: int64
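
One possible reason to prefer the `prod` form is that it extends to more than two columns without changing shape. A minimal sketch, assuming an extra numeric column `W` that does not exist in the question, and spelling the axis out as `axis=1` for readability:

```python
import pandas as pd

# Hypothetical data; column W is an assumption added for illustration
df = pd.DataFrame({
    "X": [1, 2, 3],
    "Y": [4, 5, 6],
    "W": [2, 2, 2],
    "Z": ["Value", "other", "Value"],
})

# Select the matching rows and the columns to multiply,
# then take the row-wise product across those columns
result = df.loc[df["Z"].eq("Value"), ["X", "Y", "W"]].prod(axis=1)

print(result.tolist())  # [8, 36]
```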

data.loc[data['Z'] == 'value', 'Z'] = data.loc[data['Z'] == 'value', 'X'] * data.loc[data['Z'] == 'value', 'Y']

Here's a working example:

import pandas as pd

dataframe = pd.DataFrame({'X': [1, 2, 3, 4, 5, 6],
                          'Y': [5, 6, 7, 8, 9, 0],
                          'Z': [0, 1, 0, 1, 0, 1]})

dataframe.loc[dataframe['Z'] == 0, 'Z'] = dataframe.loc[dataframe['Z'] == 0, 'X'] * dataframe.loc[dataframe['Z'] == 0, 'Y']

print(dataframe)

#    X  Y   Z
# 0  1  5   5
# 1  2  6   1
# 2  3  7  21
# 3  4  8   1
# 4  5  9  45
# 5  6  0   1
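
Note that the example above overwrites the condition column `Z` with the products. If `Z` has to survive, a variation is to compute the mask once and write into a new column instead. This is a sketch only; the column name `XY` and the variable `mask` are assumptions, not part of the answer.

```python
import pandas as pd

dataframe = pd.DataFrame({'X': [1, 2, 3, 4, 5, 6],
                          'Y': [5, 6, 7, 8, 9, 0],
                          'Z': [0, 1, 0, 1, 0, 1]})

# Build the boolean mask once and reuse it on both sides
mask = dataframe['Z'] == 0

# Assigning to a new column through .loc fills matching rows only;
# the remaining rows of the hypothetical 'XY' column are NaN
dataframe.loc[mask, 'XY'] = dataframe.loc[mask, 'X'] * dataframe.loc[mask, 'Y']

print(dataframe.loc[mask, 'XY'].tolist())  # [5.0, 21.0, 45.0]
```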

I think you want something like this:

import pandas as pd
import numpy as np

df_original = pd.DataFrame({'X': [1, 2, 3, 4, 5, 6],
                            'Y': [7, 8, 9, 10, 11, 12],
                            'Z': [False, True, True, True, False, False]})

df_original['X*Y'] = np.where(df_original.Z == True, df_original.X * df_original.Y, df_original.Z)
# Here True/False play the role of the condition ("Value"); you can compare against any value you want.

Output:

   X   Y      Z  X*Y
0  1   7  False    0
1  2   8   True   16
2  3   9   True   27
3  4  10   True   40
4  5  11  False    0
5  6  12  False    0
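
The zeros in the non-matching rows come from passing `df_original.Z` as the fallback, so `False` ends up as 0 in the result. If an explicit "no value" marker is preferred, one option is to pass `np.nan` as the third argument. This is a sketch of that variation, not what the answer above does:

```python
import numpy as np
import pandas as pd

df_original = pd.DataFrame({'X': [1, 2, 3, 4, 5, 6],
                            'Y': [7, 8, 9, 10, 11, 12],
                            'Z': [False, True, True, True, False, False]})

# np.where(condition, value_if_true, value_if_false):
# the product where Z is True, NaN everywhere else
df_original['X*Y'] = np.where(df_original['Z'],
                              df_original['X'] * df_original['Y'],
                              np.nan)

print(df_original['X*Y'].tolist())  # [nan, 16.0, 27.0, 40.0, nan, nan]
```

The column becomes float64 because NaN cannot be stored in an integer column.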
