Join columns in a single Pandas DataFrame

Question

I have a DataFrame with 4 columns and want to merge the first 3 columns into a single column of a new DataFrame.

The three columns hold the same kind of data, the order is irrelevant, and any duplicates must remain.

import pandas as pd

data = [['tom', 'nick', 'john', 10], ['bob', 'jane', 'nick', 15]]

df = pd.DataFrame(data, columns=['col1', 'col2', 'col3', 'col4'])

Desired DataFrame

+-----+-----+
|col_a|col_b|
+-----+-----+
|tom  |10   |
|nick |10   |
|john |10   |
|bob  |15   |
|jane |15   |
|nick |15   |
+-----+-----+

How do I get this done?

Answer 1

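Here is one way of merging the first three columns with the help of NumPy:

import numpy as np

a = df.values  # 2-D array holding all four columns
# Flatten the first three columns row by row; repeat col4 once per flattened value
pd.DataFrame({'col_a': np.ravel(a[:, :3]), 'col_b': np.repeat(a[:, 3], 3)})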

  col_a col_b
0   tom    10
1  nick    10
2  john    10
3   bob    15
4  jane    15
5  nick    15
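
The two pieces line up because np.ravel reads the array row by row, so each row contributes three consecutive names, while np.repeat emits each col4 value three times in a row. Below is a minimal self-contained sketch of the same idea. The helper name stack_columns and its parameters are hypothetical, not part of pandas or NumPy. It also assumes you would rather select the columns separately than go through df.values, so that col_b keeps its integer dtype instead of becoming object.

```python
import numpy as np
import pandas as pd

data = [['tom', 'nick', 'john', 10], ['bob', 'jane', 'nick', 15]]
df = pd.DataFrame(data, columns=['col1', 'col2', 'col3', 'col4'])


def stack_columns(frame, value_cols, keep_col):
    """Hypothetical helper: flatten value_cols row by row, repeating keep_col to match."""
    values = frame[value_cols].to_numpy()  # shape (n_rows, len(value_cols))
    return pd.DataFrame({
        # ravel() walks the array row by row (C order), keeping duplicates
        'col_a': values.ravel(),
        # each keep_col entry is repeated once per flattened column
        'col_b': np.repeat(frame[keep_col].to_numpy(), len(value_cols)),
    })


result = stack_columns(df, ['col1', 'col2', 'col3'], 'col4')
print(result)
```

Passing the column names explicitly avoids hard-coding the positions 3 and :3, so the same sketch should carry over to a different number of columns.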

solution1
4 ACCEPTED 2021-03-13 16:28:18
