Merge multiple rows of a CSV file into a single row using Python pandas

Question

I have a CSV file with multiple rows, as shown below:

Id  Name  Marks1 Marks2 Marks3 Marks4 Marks5
1   ABC   10     NAN    NAN    NAN    NAN
2   BCD   15     NAN    NAN    NAN    NAN
3   CDE   17     NAN    NAN    NAN    NAN
1   ABC   NAN    18     NAN    17     NAN
2   BCD   NAN    10     NAN    15     NAN
1   ABC   NAN    NAN    16     NAN    NAN
3   CDE   NAN    NAN    19     NAN    NAN

I want to merge the rows that have the same Id and Name into a single row using pandas in Python. The output should be:

Id  Name  Marks1 Marks2 Marks3 Marks4 Marks5
1   ABC   10     18     16     17     NAN
2   BCD   15     10     NAN    15     NAN
3   CDE   17     NAN    19     NAN    NAN

Answer 1

IIUC, use DataFrame.groupby with as_index=False together with GroupBy.first, which takes the first non-null value of each column within every group and so eliminates the NaN values.

# import numpy as np
# df = df.replace('NAN', np.nan)  # if necessary: turn the string 'NAN' into a real NaN
df.groupby(['Id', 'Name'], as_index=False).first()
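As a minimal end-to-end sketch of the groupby + first approach, the snippet below rebuilds the question's table from CSV text and merges it. The comma-separated layout and the use of `io.StringIO` in place of a real file are assumptions made for illustration; `na_values=["NAN"]` is used because the literal string "NAN" is not among the markers that `read_csv` treats as missing by default.

```python
import io

import pandas as pd

# Sample data from the question, written as CSV text (assumed layout).
csv_text = """Id,Name,Marks1,Marks2,Marks3,Marks4,Marks5
1,ABC,10,NAN,NAN,NAN,NAN
2,BCD,15,NAN,NAN,NAN,NAN
3,CDE,17,NAN,NAN,NAN,NAN
1,ABC,NAN,18,NAN,17,NAN
2,BCD,NAN,10,NAN,15,NAN
1,ABC,NAN,NAN,16,NAN,NAN
3,CDE,NAN,NAN,19,NAN,NAN
"""

# Parse the literal "NAN" marker as a real missing value.
df = pd.read_csv(io.StringIO(csv_text), na_values=["NAN"])

# first() returns the first non-null value per column in each (Id, Name) group.
merged = df.groupby(["Id", "Name"], as_index=False).first()
print(merged)
```

With a real file, the `io.StringIO(csv_text)` argument would simply be replaced by the file path. Because the columns contain missing values, the marks come back as floats (for example 10.0 rather than 10).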

If an (Id, Name) pair could have more than one non-null value in some column, you can use GroupBy.apply with ffill and bfill, followed by DataFrame.drop_duplicates, to keep all the information.

df.groupby(['Id','Name']).apply(lambda x: x.ffill().bfill()).drop_duplicates()

Output

   Id Name Marks1 Marks2 Marks3 Marks4  Marks5
0   1  ABC     10     18     16     17     NaN
1   2  BCD     15     10    NaN     15     NaN
2   3  CDE     17    NaN     19    NaN     NaN
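To see why the fill-based variant matters, here is a small sketch with made-up data (not from the question) in which one (Id, Name) pair has two different values in Marks1. It uses `GroupBy.transform` on the marks columns rather than `GroupBy.apply`; that is a substitution on my part, chosen because the shape of `apply` results has changed between pandas versions, while the idea (fill within each group, then drop duplicate rows) is the same.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Id 1 / ABC has two conflicting values in Marks1.
df = pd.DataFrame({
    "Id": [1, 1, 1, 2],
    "Name": ["ABC", "ABC", "ABC", "BCD"],
    "Marks1": [10, np.nan, 12, 15],
    "Marks2": [np.nan, 18, np.nan, np.nan],
})
mark_cols = ["Marks1", "Marks2"]

# first() keeps only the first non-null value, so the 12 is dropped.
first_only = df.groupby(["Id", "Name"], as_index=False).first()

# Fill gaps forward and backward inside each group, then drop identical rows.
filled = df.copy()
filled[mark_cols] = df.groupby(["Id", "Name"])[mark_cols].transform(
    lambda s: s.ffill().bfill()
)
result = filled.drop_duplicates().reset_index(drop=True)

print(first_only)
print(result)
```

Here `first_only` has one row per pair, while `result` keeps a separate row for each distinct value of Marks1, so nothing is silently discarded.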

Answer 2

Hacky answer:

df.groupby("Name").mean().reset_index()

This only works if each column holds at most one valid value per Name; if there are several, they are averaged together.
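The caveat can be illustrated with a short sketch on made-up data (not from the question), assuming the missing values are already real NaN and every column other than Name is numeric:

```python
import numpy as np
import pandas as pd

# Hypothetical data: ABC has two valid values (10 and 12) in Marks1.
df = pd.DataFrame({
    "Id": [1, 1, 2],
    "Name": ["ABC", "ABC", "BCD"],
    "Marks1": [10, 12, 15],
    "Marks2": [np.nan, 18, np.nan],
})

# mean() skips NaN, so a single valid value is returned unchanged,
# but two valid values are averaged without any warning.
averaged = df.groupby("Name").mean().reset_index()
print(averaged)
```

For ABC, Marks2 comes back as 18 because it is the only valid value, while Marks1 becomes 11, a value that appears nowhere in the input. Id is also averaged and therefore returned as a float.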

Answer 1
Score: 3, accepted, 2020-01-15 08:00:36

Answer 2
Score: 0, 2020-01-15 07:58:15
