I have this table in pandas:
  | id        | date     | freq | year | c1        | c2    | c3  |
0 | C35600010 | 20080922 | A    | 2004 | d20040331 | NaN   | NaN |
1 | C35600010 | 20080922 | A    | 2004 | NaN       | s2003 | NaN |
2 | C35600010 | 20080922 | A    | 2004 | NaN       | NaN   | s3  |
3 | C35600010 | 20080922 | Q    | 2004 | NaN       | NaN   | s3  |
4 | C35600010 | 20080923 | A    | 2004 | NaN       | NaN   | s3  |
and I want to merge it into
  | id        | date     | freq | year | c1        | c2    | c3  |
0 | C35600010 | 20080922 | A    | 2004 | d20040331 | s2003 | s3  |
1 | C35600010 | 20080922 | Q    | 2004 | NaN       | NaN   | s3  |
2 | C35600010 | 20080923 | A    | 2004 | NaN       | NaN   | s3  |
Basically, where id, date, freq and year are the same, merge the rows. It is guaranteed that, within each such group, at most one row has a non-NaN value in any given column. Is there any way to do it?
I tried the approach from "Merging same-indexed rows by taking non-NaNs from all of them in pandas dataframe", but it didn't really work; it throws an error:
df = df.groupby(["id", "date", "freq", "year"]).max()
ValueError: Wrong number of items passed 1, placement implies 4
Edit 1: There can be multiple dates associated with each id, and the same goes for freq and year. I don't want to merge those into a single row.
Only when id, date, freq and year are all the same should the rows be merged for columns c1, c2, c3.
Does this do what you want? GroupBy.first() returns the first non-null value in each column for every group:
df.groupby(["id", "date", "freq", "year"]).first().reset_index()
Output:
id date freq year c1 c2 c3
0 C35600010 20080922 A 2004 d20040331 s2003 s3
1 C35600010 20080922 Q 2004 None None s3
2 C35600010 20080923 A 2004 None None s3
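For anyone who wants to try this end to end, here is a minimal, self-contained sketch. The DataFrame is rebuilt by hand from the table in the question; the dtypes (integer date and year, string id) and the variable names keys and merged are assumptions for illustration, not something the question specifies.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; dtypes are an assumption.
df = pd.DataFrame({
    "id": ["C35600010"] * 5,
    "date": [20080922, 20080922, 20080922, 20080922, 20080923],
    "freq": ["A", "A", "A", "Q", "A"],
    "year": [2004] * 5,
    "c1": ["d20040331", np.nan, np.nan, np.nan, np.nan],
    "c2": [np.nan, "s2003", np.nan, np.nan, np.nan],
    "c3": [np.nan, np.nan, "s3", "s3", "s3"],
})

# Rows are merged only when all four key columns match.
keys = ["id", "date", "freq", "year"]

# first() skips missing values, so each group keeps the first
# non-null entry per column; reset_index() turns the keys back
# into ordinary columns.
merged = df.groupby(keys).first().reset_index()
print(merged)
```

Groups that have no value at all for a column stay missing in the result; whether they display as None or NaN can depend on the pandas version. Note that first() relies on the guarantee from the question: if a group held two different non-null values in one column, only the first would be kept.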