I have imported a .csv file with a number of columns into my Python program using the pandas module. In my code, I only import the first three columns. The code and the sample file are as follows.
import pandas as pd
fields = ['TEST ONE', 'TEST TWO', 'TEST THREE']
df1 = pd.read_csv('List.csv', skipinitialspace=True, usecols=fields)
[Image: sample file]
How can I compute the difference between the columns TEST ONE and TEST TWO and store it in a separate column or array inside the code, so that the values can be extracted from it whenever needed? I also want to find the mean and the maximum of this new column.
You can store the difference as a new column of the same DataFrame:
df1['diff'] = df1['TEST ONE'] - df1['TEST TWO']
# The DataFrame is df1 throughout.
# The line above stores the difference as a new column of that same DataFrame.
# When you need the difference, use that column like any other pandas column.
mean_of_diff = df1['diff'].mean()
max_of_diff = df1['diff'].max()
# For the third value of the difference, use index label 2 (the default index starts at 0)
third_diff = df1.loc[2, 'diff']
Note: I have used 2 because the default index starts from 0. The index can also be a string or a date, so pass the appropriate index label to get the row you want.
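To make the note about index labels concrete, here is a minimal sketch. List.csv is not available here, so the DataFrame is built in memory with made-up values and a hypothetical string index; only the column names come from the question. It shows that .loc looks rows up by label, while .iloc looks them up by position.

```python
import pandas as pd

# Hypothetical stand-in for List.csv; the values and the string index
# are invented for illustration only.
df1 = pd.DataFrame(
    {
        "TEST ONE": [10, 20, 30, 40],
        "TEST TWO": [1, 5, 12, 8],
        "TEST THREE": [0, 0, 0, 0],
    },
    index=["a", "b", "c", "d"],  # string labels instead of 0, 1, 2, ...
)

# Store the difference as a new column of the same DataFrame.
df1["diff"] = df1["TEST ONE"] - df1["TEST TWO"]

mean_of_diff = df1["diff"].mean()
max_of_diff = df1["diff"].max()

# Label-based lookup: the row whose index label is "c".
third_by_label = df1.loc["c", "diff"]

# Position-based lookup: the third row, whatever its label is.
third_by_position = df1["diff"].iloc[2]

print(mean_of_diff, max_of_diff, third_by_label, third_by_position)
```

With the default integer index, df1.loc[2, 'diff'] and df1['diff'].iloc[2] return the same value. They only differ once the index holds strings, dates, or reordered integers.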
Difference = df1['TEST ONE'] - df1['TEST TWO']
Difference is a pandas Series, so you can call mean() and max() on it directly:
Difference.mean()
Difference.max()
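If you would rather keep the difference outside the DataFrame, as the question also asks about a separate array, something like the following sketch may help. The data is again made up in place of List.csv, and the variable names difference, diff_array and row_of_max are just illustrative. Series.to_numpy() gives a plain NumPy array, and Series.idxmax() gives the index label of the row where the maximum occurs.

```python
import pandas as pd

# Hypothetical data standing in for List.csv (values are invented).
df1 = pd.DataFrame(
    {
        "TEST ONE": [10, 20, 30, 40],
        "TEST TWO": [1, 5, 12, 8],
    }
)

# A standalone Series; df1 itself gets no new column.
difference = df1["TEST ONE"] - df1["TEST TWO"]

# The same values as a plain NumPy array, if an array is preferred.
diff_array = difference.to_numpy()

# Index label of the row where the difference is largest.
row_of_max = difference.idxmax()

print(difference.mean(), difference.max(), row_of_max)
```

The Series keeps the same index as df1, so row_of_max can be passed straight back to df1.loc to inspect the full row.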