Pandas: merge cells in the first column that have the same value

Question

I want to merge the consecutive values in the first column of an Excel file and export the result to another file. My question is pretty similar to this one, but I can't get the correct output file.

Input Excel file (Modules.xlsx):

[Image: input Excel file]

import pandas as pd

data = pd.read_excel(io="Modules.xlsx")
# Use the first column as a (single-level) index.
df = pd.DataFrame(data=data).set_index([data.columns[0]])
print(df)
with pd.ExcelWriter(path="excel_file.xlsx", engine="xlsxwriter") as writer:
    df.to_excel(excel_writer=writer, sheet_name="Inventories")
    old_ws = writer.sheets.get("Inventories")
    # Rewrite the header cells as plain, unformatted values.
    for col, val in enumerate(df.reset_index().columns):
        old_ws.write(0, col, val)

                                                   Module Name Serial Number       PID                                 Description
MGMT IP Address (Hostname)
sandbox-iosxe-latest-1.cisco.com (csr1000v-1)          Chassis   9ESGOBARV9D  CSR1000V                      Cisco CSR1000V Chassis
sandbox-iosxe-latest-1.cisco.com (csr1000v-1)        module R0   JAB1303001C  CSR1000V              Cisco CSR1000V Route Processor
sandbox-iosxe-latest-1.cisco.com (csr1000v-1)        module F0           NaN  CSR1000V  Cisco CSR1000V Embedded Services Processor
sandbox-iosxe-recomm-1.cisco.com (csr1000v-recomm)     Chassis   926V75BDNRJ  CSR1000V                      Cisco CSR1000V Chassis
sandbox-iosxe-recomm-1.cisco.com (csr1000v-recomm)   module R0   JAB1303001C  CSR1000V              Cisco CSR1000V Route Processor
sandbox-iosxe-recomm-1.cisco.com (csr1000v-recomm)   module F0           NaN  CSR1000V  Cisco CSR1000V Embedded Services Processor

The output excel_file.xlsx is exactly the same as Modules.xlsx. What am I missing to get excel_file.xlsx to look like the image below?

Pandas v1.3.4 & xlsxwriter v3.0.2

[Image: Excel file showing the desired output]

Answer 1

First, what is "d" in df = pd.DataFrame(data=data).set_index([d.columns[0]])?

From the accepted answer to the question you linked, I take it that the index must be multilevel (more than one index level).

So you would have .set_index(["MGMT IP Address (Hostname)", "Module Name"]).

Without having your data I can't check that, though.
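That said, here is a rough sketch of how this could look. It uses six rows retyped from the DataFrame printed in the question, not the real Modules.xlsx, and the output path is a hypothetical temp-file location. The engine is left to whatever pandas picks up, and merge_cells=True (the to_excel default) is spelled out only to make the intent visible.

```python
import os
import tempfile

import pandas as pd

# Rows retyped from the printed DataFrame in the question (an assumption:
# the real Modules.xlsx may contain more rows or different values).
host1 = "sandbox-iosxe-latest-1.cisco.com (csr1000v-1)"
host2 = "sandbox-iosxe-recomm-1.cisco.com (csr1000v-recomm)"
rows = [
    (host1, "Chassis", "9ESGOBARV9D", "CSR1000V", "Cisco CSR1000V Chassis"),
    (host1, "module R0", "JAB1303001C", "CSR1000V", "Cisco CSR1000V Route Processor"),
    (host1, "module F0", None, "CSR1000V", "Cisco CSR1000V Embedded Services Processor"),
    (host2, "Chassis", "926V75BDNRJ", "CSR1000V", "Cisco CSR1000V Chassis"),
    (host2, "module R0", "JAB1303001C", "CSR1000V", "Cisco CSR1000V Route Processor"),
    (host2, "module F0", None, "CSR1000V", "Cisco CSR1000V Embedded Services Processor"),
]
columns = ["MGMT IP Address (Hostname)", "Module Name", "Serial Number", "PID", "Description"]

# Two index levels instead of one, as suggested above.
df = pd.DataFrame(rows, columns=columns).set_index(columns[:2])

# Hypothetical output location; any writable path would do.
out_path = os.path.join(tempfile.gettempdir(), "excel_file.xlsx")

# With a MultiIndex, merge_cells=True lets pandas merge repeated
# outer-level labels into one cell per run.
with pd.ExcelWriter(out_path) as writer:
    df.to_excel(writer, sheet_name="Inventories", merge_cells=True)
```

With these six rows the hostname column should come out as two merged blocks of three rows each, while "Module Name" (the innermost level) stays one cell per row.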

Maybe this simple example holds true for your data as well:

import pandas as pd

data = {"A": ["a", "a", "b", "c", "d"], "B": [2, 2, 2, 2, 1], "C": [1, 2, 3, 5, 6]}
df1 = pd.DataFrame(data=data).set_index(["A"])
df1
   B  C
A      
a  2  1
a  2  2
b  2  3
c  2  5
d  1  6

df2 = pd.DataFrame(data=data).set_index(["A","B"])
df2

     C
A B   
a 2  1
  2  2
b 2  3
c 2  5
d 1  6
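If a second index level is not wanted (the df1 case), one alternative is to merge the runs by hand with xlsxwriter's worksheet.merge_range. This is a different technique from the multilevel-index approach, not what the linked accepted answer does. The sketch below is untested against the real file: the helper names consecutive_runs and write_with_merged_first_column are made up for illustration, and it assumes the first column holds strings.

```python
from itertools import groupby

import pandas as pd


def consecutive_runs(values):
    """Return (first_pos, last_pos, value) for each run of equal neighbours."""
    runs, pos = [], 0
    for value, group in groupby(values):
        length = len(list(group))
        runs.append((pos, pos + length - 1, value))
        pos += length
    return runs


def write_with_merged_first_column(df, path, sheet_name="Sheet1"):
    """Hypothetical helper: write df, then merge runs in column A by hand."""
    flat = df.reset_index()  # the index becomes an ordinary first column
    with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
        flat.to_excel(writer, sheet_name=sheet_name, index=False)
        ws = writer.sheets[sheet_name]
        centered = writer.book.add_format({"valign": "vcenter"})
        for first, last, value in consecutive_runs(flat.iloc[:, 0]):
            if last > first:
                # +1 on the rows skips the header row written by to_excel.
                ws.merge_range(first + 1, 0, last + 1, 0, value, centered)


# Runs in column "A" of the example data above.
print(consecutive_runs(["a", "a", "b", "c", "d"]))
# → [(0, 1, 'a'), (2, 2, 'b'), (3, 3, 'c'), (4, 4, 'd')]
```

Calling write_with_merged_first_column(df1, "excel_file.xlsx") would then be expected to merge only the two "a" rows, since every other value forms a run of length one.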


Answer 1
1 accepted, 2021-11-22 16:06:47
