Pandas DataFrame always creates a new index column

Question

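I am trying to append a dataframe csv_objects to another dataframe result, to get one combined dataframe. They have exactly the same columns and are created the same way: both dataframes are built like the code below, one is saved to CSV and then read back in, and that is the point where I try to combine it with the newly created dataframe.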

        result = pd.DataFrame(data=np.reshape(self.get_data_from_object(object_id), (1, 14)),
                              columns=("corners", "parts", "sharp", "steep",
                                       "flat", "flat_count", "over_air", "object_overhang", "bridges", "thin",
                                       "total_area", "length_width", "length_height", "smallest_area"))

The data comes from here (all float values); this is the return statement of get_data_from_object:

        return corners, parts, sharp, steep, flat, flat_count, over_air, object_overhang, bridges, thin, total_area, length_width, length_height, smallest_area

I tried to combine them like this:

        csv_objects.loc[csv_objects.index.size]=result.loc[0]

or like this:


        csv_objects.append(result)

Code to reproduce the problem:



import numpy as np
import pandas as pd

path = "."  # placeholder: any writable directory

data = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

return_array = pd.DataFrame(data=np.reshape(data, (1, 14)),
                            columns=("corners", "parts", "sharp", "steep",
                                     "flat", "flat_count", "over_air", "object_overhang", "bridges", "thin",
                                     "total_area", "length_width", "length_height", "smallest_area"))

# Save the first frame, then read it back in
return_array.to_csv(path + "/save.csv")
csv_objects = pd.read_csv(path + "/save.csv")

result = pd.DataFrame(data=np.reshape(data, (1, 14)),
                      columns=("corners", "parts", "sharp", "steep",
                               "flat", "flat_count", "over_air", "object_overhang", "bridges", "thin",
                               "total_area", "length_width", "length_height", "smallest_area"))

# Add the new row after the last row of the frame read from CSV
csv_objects.loc[csv_objects.index.size] = result.loc[0]

print(csv_objects)

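But it always creates a new index column, so the resulting dataframe has 16 columns even though the old frames each have 15 (14 values and 1 index), which is not what I want. How can I prevent this and make them use the same index values? That is, I need the first row of the new frame to continue from the last index value of the old frame.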

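When I print the single frame, it looks like this:

        [1 rows x 15 columns]
           Unnamed: 0  corners  parts  ...  length_width  length_height  smallest_area
        0           0      0.0    0.0  ...           1.0            0.5            1.0

When I print the combined frame, it looks like this:

        [1 rows x 16 columns]
           Unnamed: 0  Unnamed: 0.1  ...  length_height  smallest_area
        1           1           NaN  ...            0.5            1.0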

Answer 1

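I'm not sure whether you are just trying to combine a bunch of CSV files with the same column names and order, but if that is the case, the following may be enough: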

df1 = pd.read_csv("my_file.csv")
df2 = pd.read_csv("my_other_file.csv")

combined_df = pd.concat([df1, df2])

To make sure combined_df uses a single index (rather than keeping the separate indices of df1 and df2), use reset_index:

combined_df.reset_index(drop=True, inplace=True)
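
As a minimal sketch of how this might fit the question: by default, to_csv writes the row index as an extra unnamed column, and read_csv then loads it back as a data column named "Unnamed: 0", which is a likely source of the extra column. The example below is an illustration and not the asker's code. It uses a shortened, three-column list and in-memory CSV text instead of a file on disk. It first shows the extra column, then writes with index=False and combines the frames as above.

```python
import io

import pandas as pd

# Shortened column list, for illustration only (the question uses 14 columns).
columns = ["corners", "parts", "sharp"]

saved = pd.DataFrame([[1.0, 2.0, 3.0]], columns=columns)
new_row = pd.DataFrame([[4.0, 5.0, 6.0]], columns=columns)

# Default to_csv() writes the index as an unnamed first column, which
# read_csv() then loads back as a data column called "Unnamed: 0".
with_index = pd.read_csv(io.StringIO(saved.to_csv()))
print(list(with_index.columns))

# Writing with index=False keeps the index out of the CSV text.
csv_objects = pd.read_csv(io.StringIO(saved.to_csv(index=False)))

# Concatenate the frames, then renumber the rows from 0.
combined = pd.concat([csv_objects, new_row])
combined.reset_index(drop=True, inplace=True)
print(combined)
```

If the file has already been written with its index, reading it with pd.read_csv(..., index_col=0) should treat that first column as the index instead. Note also that DataFrame.append returned a new frame rather than modifying csv_objects in place, and it was removed in pandas 2.0, so pd.concat is the safer choice.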

Solution 1
0 2021-04-29 22:18:46
