I have a data frame that contains duplicate values within each column (not duplicated rows). The data looks like this:
|Col1|Col2|Cold3|Col4|
| 1| A| John| -10|
| 2| A|Scoot| 234|
| 2| B|Kerry| 346|
| 6| B| Adam| -10|
I would like to create another data frame from this one that looks like this:
|Col1|Col2|Cold3|Col4|
| 1| A| John| -10|
| 2| B|Scoot| 234|
| 6|null|Kerry| 346|
|null|null| Adam|null|
Those nulls could be NaN, of course.
I can go through each column and print its unique values:
for col in df:
    print(df[col].unique())
which returns NumPy arrays. But I'm not sure how to write them to a new data frame that looks like the one I showed earlier.
I think you need:
df = df.apply(lambda x: pd.Series(x.unique()))
print(df)
   Col1 Col2  Cold3   Col4
0   1.0    A   John  -10.0
1   2.0    B  Scoot  234.0
2   6.0  NaN  Kerry  346.0
3   NaN  NaN   Adam    NaN
Or:
df = df.apply(lambda x: pd.Series(x.drop_duplicates().values))
print(df)
   Col1 Col2  Cold3   Col4
0   1.0    A   John  -10.0
1   2.0    B  Scoot  234.0
2   6.0  NaN  Kerry  346.0
3   NaN  NaN   Adam    NaN
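For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the table in the question (the column name Cold3 is kept as posted), and the names result and col1_int are only illustrative. It also shows why Col1 and Col4 print as floats: padding the shorter columns with NaN upcasts integer columns to float64. Casting to the nullable "Int64" dtype is one possible way to get integers back, assuming a pandas version that supports nullable integers.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed, typed in by hand).
df = pd.DataFrame({
    "Col1": [1, 2, 2, 6],
    "Col2": ["A", "A", "B", "B"],
    "Cold3": ["John", "Scoot", "Kerry", "Adam"],
    "Col4": [-10, 234, 346, -10],
})

# Take the unique values of every column in order of first appearance.
# Each column gets a fresh 0..n-1 index, so shorter columns are padded
# with NaN when pandas aligns them into one frame.
result = df.apply(lambda col: pd.Series(col.unique()))
print(result)

# NaN padding turned the integer columns into float64.
# Optional: cast one back to a nullable integer column.
col1_int = result["Col1"].astype("Int64")
print(col1_int)
```

The same result can be built without apply, directly from the loop in the question, with pd.DataFrame({c: pd.Series(df[c].unique()) for c in df}).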