Here is a snippet of a dataframe I'm trying to analyze. What I want to do is subtract the FP_FLOW FORMATTED_ENTRY value from the D8_FLOW FORMATTED_ENTRY value, but only where the X_LOT_NAME is the same. For example, in the X_LOT_NAME column you can see MPACZX2. Its D8_FLOW FORMATTED_ENTRY is 12.3% and its FP_FLOW FORMATTED_ENTRY is 7.8%, so the difference between the two would be 4.5%. I want to apply this logic across the whole data set.
Is this what you are looking for?
df.groupby(['x_lot'])['value'].diff()
0 NaN
1 NaN
2 -5.0
3 8.0
4 -3.0
5 NaN
6 -10.0
Name: value, dtype: float64
This is the data I used to get the above results:
x_lot type value
0 mpaczw1 fp 21
1 mpaczw2 d8 12
2 mpaczw2 fp 7
3 mpaczw2 d8 15
4 mpaczw2 fp 12
5 mpaczw3 d8 21
6 mpaczw3 fp 11
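For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame above. Two things are worth keeping in mind: diff() subtracts the previous row from the current row within each group, so with d8 listed before fp the result is fp minus d8 (the opposite sign from what the question asks for), and it works on consecutive rows, so a lot with several d8/fp pairs (like mpaczw2) also gets differences that cross pairs. The name d8_minus_fp is just an illustrative variable, and the negation assumes d8 comes before fp in each lot.

```python
import pandas as pd

# Sample data copied from the answer above
df = pd.DataFrame({
    "x_lot": ["mpaczw1", "mpaczw2", "mpaczw2", "mpaczw2",
              "mpaczw2", "mpaczw3", "mpaczw3"],
    "type": ["fp", "d8", "fp", "d8", "fp", "d8", "fp"],
    "value": [21, 12, 7, 15, 12, 21, 11],
})

# Within each x_lot group: current row minus previous row
diff = df.groupby("x_lot")["value"].diff()

# Flip the sign to get previous minus current,
# i.e. d8 - fp when the d8 row comes first (an assumption about row order)
d8_minus_fp = -diff
print(d8_minus_fp)
```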
It is advisable to first convert your data into a format where the values to be added or subtracted are in the same row, and then add or subtract the corresponding columns. You can do this using pd.pivot_table. The example below demonstrates this using a sample dataframe similar to the one you've shared:
wanted_data
X_LOT_NAME SPEC_TYPE FORMATTED_ENTRY
0 a FP_FLOW 1
1 a D8_FLOW 2
2 c FP_FLOW 3
3 c D8_FLOW 4
pivot_data = pd.pivot_table(wanted_data, values='FORMATTED_ENTRY', index='X_LOT_NAME', columns='SPEC_TYPE')
pivot_data
SPEC_TYPE D8_FLOW FP_FLOW
X_LOT_NAME
a 2 1
c 4 3
After this step, the resultant pivot_data contains the same data, but the columns are D8_FLOW and FP_FLOW, with X_LOT_NAME as the index. Now you can get the intended value in a new column using:
pivot_data['DIFF'] = pivot_data['D8_FLOW'] - pivot_data['FP_FLOW']
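Putting the steps together, here is a rough end-to-end sketch. It assumes FORMATTED_ENTRY holds strings such as "12.3%" (the question shows percentages, but the actual dtype isn't stated), so they are converted to floats first. The MPACZX2 values are taken from the question; the MPACZX3 row is made up purely for illustration.

```python
import pandas as pd

# MPACZX2 values come from the question; MPACZX3 is a made-up second lot
wanted_data = pd.DataFrame({
    "X_LOT_NAME": ["MPACZX2", "MPACZX2", "MPACZX3", "MPACZX3"],
    "SPEC_TYPE": ["D8_FLOW", "FP_FLOW", "D8_FLOW", "FP_FLOW"],
    "FORMATTED_ENTRY": ["12.3%", "7.8%", "10.0%", "6.5%"],
})

# Assumption: entries are strings like "12.3%"; strip the sign and convert
wanted_data["FORMATTED_ENTRY"] = (
    wanted_data["FORMATTED_ENTRY"].str.rstrip("%").astype(float)
)

# One row per lot, one column per SPEC_TYPE.
# The default aggfunc is the mean, so duplicate lot/type rows get averaged.
pivot_data = pd.pivot_table(
    wanted_data,
    values="FORMATTED_ENTRY",
    index="X_LOT_NAME",
    columns="SPEC_TYPE",
)

pivot_data["DIFF"] = pivot_data["D8_FLOW"] - pivot_data["FP_FLOW"]
print(pivot_data)
```

Floating-point subtraction can leave a value very slightly off from 4.5, so rounding DIFF (for example with .round(1)) before displaying it may be worthwhile.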