I have a DataFrame with 4 columns and want to merge the first 3 columns into a single column of a new DataFrame.
The data is identical, the order is irrelevant and any duplicates must remain.
import pandas as pd
data = [['tom', 'nick', 'john', 10], ['bob', 'jane', 'nick', 15]]
df = pd.DataFrame(data, columns = ['col1', 'col2', 'col3','col4'])
Desired DataFrame:
+-----+-----+
|col_a|col_b|
+-----+-----+
|tom  |10   |
|nick |10   |
|john |10   |
|bob  |15   |
|jane |15   |
|nick |15   |
+-----+-----+
How do I get this done?
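Here is one way of merging the first three columns with the help of NumPy:
import numpy as np
a = df.values  # 2-D array holding all four columns
# Flatten the first three columns row by row and repeat each col4 value three times to match
pd.DataFrame({'col_a': np.ravel(a[:, :3]), 'col_b': np.repeat(a[:, 3], 3)})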
col_a col_b
0 tom 10
1 nick 10
2 john 10
3 bob 15
4 jane 15
5 nick 15
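For comparison, here is a pandas-only sketch of the same reshaping using `DataFrame.melt`. It is an alternative to the NumPy approach above, not part of the original answer, and the variable name `result` is just illustrative. Note that `melt` stacks the values column by column rather than row by row, which should be fine here since the question says the order is irrelevant.

```python
import pandas as pd

data = [['tom', 'nick', 'john', 10], ['bob', 'jane', 'nick', 15]]
df = pd.DataFrame(data, columns=['col1', 'col2', 'col3', 'col4'])

# melt stacks col1..col3 into a single column and repeats col4 alongside each value
result = (
    df.melt(id_vars='col4', value_vars=['col1', 'col2', 'col3'], value_name='col_a')
      .rename(columns={'col4': 'col_b'})[['col_a', 'col_b']]
)
print(result)
```

Duplicates such as the second `nick` are kept, because `melt` only reshapes the data and never drops rows.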