How to calculate the difference between columns in python?

Question

I have a pandas dataframe with the following values:

source	ACCESS CREATED	TERMS SIGNED	BUREAU
Facebook	12	8	6
Google	160	136	121
Email	29	26	25

This is a small example of this dataframe, actually there are more rows and columns in them, but maybe for example it should help.

The outuput (in percentage):

source	TERMS SIGNED	BUREAU
Facebook	33.33	25.00
Google	15.00	11.03
Email	10.34	3.85

How to create a new dataframe with the difference (in percentage) from one column to another, for example: COLUMN A: 12, COLUMN B: 8, so the difference in this step is 33.33%, and from COLUMN C: 6, and the difference from B to C is 25%.

UPDATE

I found this solution:

def percentage_change(col1,col2):
    return ((col2 - col1) / col1) * 100

SO, How can I iterate this for all my columns?

Answer 1

Here's what I would suggest.

I defined three functions:

pairwise_percent_difference() computes percentage differences for each pair of columns which are next to one another. It takes a parameter, columns_excluded , which is the number of columns at the beginning to ignore. I set this to 1, since you want to ignore the source column.
column_difference() is called once for every pair of columns. This step is pretty easy because we can subtract two columns, and Pandas will automatically turn that into subtracting each element of the two columns, one row at a time.
pairwise() is a function which accepts an iterable (eg a list of Pandas column names) and emits an iterable with each pair of neighboring elements in a tuple together. It is copied from the itertools documentation.

Code:

import pandas as pd
import itertools
import io  # Only used for loading example dataframe

s = """source,ACCESS CREATED,TERMS SIGNED,BUREAU
Facebook,12,8,6
Google,160,136,121
Email,29,26,25"""

df = pd.read_csv(io.StringIO(s))

def pairwise(iterable):
    # pairwise('ABCDEFG') --> AB BC CD DE EF FG
    a, b = itertools.tee(iterable)
    next(b, None)
    return zip(a, b)

def column_difference(col1, col2):
    return (col1 - col2) / col1 * 100

def pairwise_percent_difference(df, columns_excluded=0):
    output_df = df.iloc[:, :columns_excluded].copy()
    cols_to_compare = df.columns[columns_excluded:]
    for col1, col2 in pairwise(cols_to_compare):
        output_df[col2] = column_difference(df[col1], df[col2])
    return output_df

print(pairwise_percent_difference(df, 1))

Output:

     source   TERMS SIGNED      BUREAU
0  Facebook       33.333333  25.000000
1    Google       15.000000  11.029412
2     Email       10.344828   3.846154

How to calculate the difference between columns in python?

Question

1 answers

solution1
0 ACCPTED 2022-04-11 01:12:15

How to calculate the difference between columns in python?

Question

1 answers

solution1 0 ACCPTED 2022-04-11 01:12:15

solution1
0 ACCPTED 2022-04-11 01:12:15