Merge multiple rows into one row in a CSV file using Python pandas
I have a CSV file with multiple rows, as shown below:
Id Name Marks1 Marks2 Marks3 Marks4 Marks5
1 ABC 10 NAN NAN NAN NAN
2 BCD 15 NAN NAN NAN NAN
3 CDE 17 NAN NAN NAN NAN
1 ABC NAN 18 NAN 17 NAN
2 BCD NAN 10 NAN 15 NAN
1 ABC NAN NAN 16 NAN NAN
3 CDE NAN NAN 19 NAN NAN
I want to merge the rows that have the same Id and Name into a single row using pandas in Python. The output should be:
Id Name Marks1 Marks2 Marks3 Marks4 Marks5
1 ABC 10 18 16 17 NAN
2 BCD 15 10 NAN 15 NAN
3 CDE 17 NAN 19 NAN NAN
IIUC, use DataFrame.groupby with as_index=False together with GroupBy.first to eliminate the NaN values. GroupBy.first returns the first non-null value of each column within each group.
#df = df.replace('NAN',np.nan) #If necessary
df.groupby(['Id','Name'],as_index=False).first()
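For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes the file is whitespace-separated and that missing marks are written literally as NAN; the inline `csv_text` string stands in for the real file and is only an illustration. Passing `na_values=["NAN"]` to `read_csv` plays the same role as the commented-out `replace` call above.

```python
import io

import pandas as pd

# Stand-in for the CSV file from the question (assumed whitespace-separated).
csv_text = """Id Name Marks1 Marks2 Marks3 Marks4 Marks5
1 ABC 10 NAN NAN NAN NAN
2 BCD 15 NAN NAN NAN NAN
3 CDE 17 NAN NAN NAN NAN
1 ABC NAN 18 NAN 17 NAN
2 BCD NAN 10 NAN 15 NAN
1 ABC NAN NAN 16 NAN NAN
3 CDE NAN NAN 19 NAN NAN
"""

# Treat the literal string 'NAN' as a missing value while reading.
df = pd.read_csv(io.StringIO(csv_text), sep=r"\s+", na_values=["NAN"])

# One row per (Id, Name); each column keeps its first non-null value.
merged = df.groupby(["Id", "Name"], as_index=False).first()
print(merged)
```

Because the marks columns contain missing values, pandas reads them as floats, so the merged values print as 10.0, 18.0 and so on.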
If you think an Id/Name pair could have more than one non-null value in some column, you could use GroupBy.apply with Series.ffill and Series.bfill, followed by DataFrame.drop_duplicates, to keep all the information.
df.groupby(['Id','Name']).apply(lambda x: x.ffill().bfill()).drop_duplicates()
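To see what "keep all the information" means, here is a small sketch on made-up data in which the pair (1, ABC) has two different Marks1 values. Note that this is a variant, not the exact one-liner above: it calls GroupBy.ffill and GroupBy.bfill directly instead of going through GroupBy.apply, on the assumption that this keeps the original row index and behaves the same across pandas versions. The sample frame and the `marks` column list are illustrative only.

```python
import numpy as np
import pandas as pd

# Made-up data: (1, ABC) has two distinct non-null values in Marks1.
df = pd.DataFrame({
    "Id":     [1, 1, 1, 2, 2],
    "Name":   ["ABC", "ABC", "ABC", "BCD", "BCD"],
    "Marks1": [10, np.nan, 12, 15, np.nan],
    "Marks2": [np.nan, 18, np.nan, np.nan, 10],
})

keys = ["Id", "Name"]
marks = ["Marks1", "Marks2"]  # columns to fill within each group

filled = df.copy()
# Forward-fill, then backward-fill, inside each (Id, Name) group.
filled[marks] = filled.groupby(keys)[marks].ffill()
filled[marks] = filled.groupby(keys)[marks].bfill()

# Rows that became identical after filling collapse into one.
result = filled.drop_duplicates()
print(result)
```

Here (1, ABC) ends up with two rows, one for each Marks1 value, whereas GroupBy.first would keep only the first of them.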
Output
Id Name Marks1 Marks2 Marks3 Marks4 Marks5
0 1 ABC 10 18 16 17 NaN
1 2 BCD 15 10 NaN 15 NaN
2 3 CDE 17 NaN 19 NaN NaN
Hacky answer:
df.groupby("Name").mean().reset_index()
This only gives the intended result if each column has at most one valid value per Name; if there are several, they are averaged.
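A quick sketch of that caveat, on made-up data where BCD has two valid Marks1 values (the frame below is illustrative, not the data from the question):

```python
import numpy as np
import pandas as pd

# Made-up data: ABC has one valid value per column, BCD has two in Marks1.
df = pd.DataFrame({
    "Name":   ["ABC", "ABC", "BCD", "BCD"],
    "Marks1": [10, np.nan, 15, 17],
    "Marks2": [np.nan, 18, np.nan, np.nan],
})

# mean() skips NaN, so a single valid value is returned unchanged,
# but several valid values are averaged together.
means = df.groupby("Name").mean().reset_index()
print(means)
```

For ABC the result matches what GroupBy.first would give; for BCD, Marks1 comes out as the average of 15 and 17 rather than either original mark.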