Pandas: create columns with tuples as labels from unique pairs of row values

Question

Imagine a df like this:

timestamp	data_point_1	data_point_2	some_data
2021/06/24	a	b	2
2021/06/24	c	d	3
2021/06/25	c	d	3

I want to reshape it into a df like this, where each unique pair of values from data_point_1 and data_point_2 becomes a tuple column label and the cells hold the some_data value for each timestamp:

timestamp	(a,b)	(c,d)
2021/06/24	2	3
2021/06/25	NaN	3

Here's the example data snippet:

import pandas as pd

test = pd.DataFrame({'timestamp': ["2021/06/24", "2021/06/24", "2021/06/25"], 'data_point_1': ["a", "c", "c"], 'data_point_2': ["b", "d", "d"], 'some_data': [2, 3, 3]})

print(test)
#    timestamp data_point_1 data_point_2  some_data
# 0  2021/06/24            a            b          2
# 1  2021/06/24            c            d          3
# 2  2021/06/25            c            d          3

# desired:
#    timestamp   (a,b)       (c,d)
# 0  2021/06/24    2           3
# 1  2021/06/25    0           3

Thanks :)

Answer 1

Use DataFrame.pivot, then convert the MultiIndex column values to tuples:

# keyword arguments: pandas 2.0+ no longer accepts positional arguments here
df = test.pivot(index='timestamp', columns=['data_point_1','data_point_2'], values='some_data')
df.columns = [tuple(x) for x in df.columns]
df = df.reset_index()
print (df)
    timestamp  (a, b)  (c, d)
0  2021/06/24     2.0     3.0
1  2021/06/25     NaN     3.0
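
The code comment in the question shows 0 rather than NaN for the missing (a, b) pair on 2021/06/25. If that is what is wanted, one possible variation is to fill the gaps before flattening the column labels. This is only a sketch: it assumes 0 is an acceptable stand-in for "no row for this pair on this date", and the name `wide` is just illustrative.

```python
import pandas as pd

test = pd.DataFrame({
    "timestamp": ["2021/06/24", "2021/06/24", "2021/06/25"],
    "data_point_1": ["a", "c", "c"],
    "data_point_2": ["b", "d", "d"],
    "some_data": [2, 3, 3],
})

# Reshape: one row per timestamp, one column per (data_point_1, data_point_2) pair.
wide = test.pivot(index="timestamp",
                  columns=["data_point_1", "data_point_2"],
                  values="some_data")

# Assumption: missing pairs should read as 0. Filling removes the NaN values,
# so the columns can be cast back from float to int.
wide = wide.fillna(0).astype(int)

# Flatten the two-level column index into plain tuple labels.
wide.columns = [tuple(x) for x in wide.columns]
wide = wide.reset_index()

print(wide)
```

Note that after the fill, a real 0 in some_data can no longer be told apart from a missing pair, so keeping NaN may be the safer default.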

If the values need to be aggregated, i.e. there are duplicate rows per combination of timestamp, data_point_1 and data_point_2, use DataFrame.pivot_table with an aggregate function such as mean:

# aggregate duplicate rows instead of raising an error
df = test.pivot_table(index='timestamp',
                      columns=['data_point_1','data_point_2'],
                      values='some_data',
                      aggfunc='mean')
df.columns = [tuple(x) for x in df.columns]
df = df.reset_index()
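
To make the duplicate case concrete, here is a small sketch with made-up data (the frame `dup` and its values are hypothetical, not from the question). The (a, b) pair appears twice on 2021/06/24, so DataFrame.pivot would raise a ValueError, while pivot_table averages the two rows.

```python
import pandas as pd

# Hypothetical data: (a, b) occurs twice on 2021/06/24 with values 2 and 4.
dup = pd.DataFrame({
    "timestamp": ["2021/06/24", "2021/06/24", "2021/06/24", "2021/06/25"],
    "data_point_1": ["a", "a", "c", "c"],
    "data_point_2": ["b", "b", "d", "d"],
    "some_data": [2, 4, 3, 3],
})

# pivot_table collapses the duplicate rows with the chosen aggregate function.
agg = dup.pivot_table(index="timestamp",
                      columns=["data_point_1", "data_point_2"],
                      values="some_data",
                      aggfunc="mean")

# Same flattening step as before: MultiIndex columns become tuple labels.
agg.columns = [tuple(x) for x in agg.columns]
agg = agg.reset_index()

print(agg)
```

Here the (a, b) cell for 2021/06/24 becomes the mean of 2 and 4. Other aggregations such as 'sum', 'first' or 'max' can be passed to aggfunc in the same way, depending on what a duplicate is supposed to mean in the data.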
