How to extract Excel column data into a Python list using pandas from a merged cell

I'm trying to extract the 'Country' column data into a Python list using pandas. Below is the code I used. The Excel sheet and the output are also attached.


from pandas import DataFrame
import pandas as pd
open_file = pd.read_excel('data.xlsx', sheet_name=0)
df = list(open_file['Country'])

Output:

[nan, 'Great Britain', 'China ', 'Russia', 'United States', 'Korea', 'Japan', 'Germany']

Process finished with exit code 0

In the output I can see nan because two cells in the sheet are merged into one. How can I avoid this?


Use header=1 so that the second row of the sheet is read as the header. The merged column then shows up as Unnamed: 0 (or Unnamed: 1, Unnamed: 2, depending on its position), and you can convert its values to a list:

import pandas as pd

df = pd.read_excel('data.xlsx', sheet_name=0, header=1)
print(df['Unnamed: 0'].to_list())
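
To try the header=1 idea without the original workbook, here is a minimal sketch that uses an in-memory CSV as a stand-in for the sheet. The layout and the numbers are made up for illustration, and the rename back to Country is an optional extra step that is not part of the answer above. It assumes the empty cell under the merged header gets the placeholder name Unnamed: 0, which is what happens for the first column.

```python
import io

import pandas as pd

# Made-up stand-in for the sheet: line 0 holds the top-level (merged) headers,
# line 1 holds the sub-headers, and the cell under "Country" is empty.
csv_text = (
    "Country,Media Tally,,\n"
    ",Gold,Silver,Bronze\n"
    "Great Britain,1,2,3\n"
    "China,4,5,6\n"
)

# header=1 uses the second line as the header and skips the line above it,
# so the empty header cell becomes the placeholder name "Unnamed: 0".
df = pd.read_csv(io.StringIO(csv_text), header=1)

# Optional: give the placeholder column its real name back.
df = df.rename(columns={"Unnamed: 0": "Country"})

countries = df["Country"].to_list()
print(countries)
```

With a real workbook, pd.read_excel('data.xlsx', sheet_name=0, header=1) would take the place of the read_csv call.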

Try this:

import pandas as pd
# Read the first two rows as a two-level (MultiIndex) header
df = pd.read_excel('data.xlsx', header=[0, 1])
df = df.rename(columns=lambda x: x if 'Unnamed' not in str(x) else '')

Now the headers are tuples. For example, to access the Country column or the Gold column, write something like the following:

print(df[('Country', '')])
print(df[('Media Tally', 'Gold')])
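
To get the plain Python list the question asks for, call tolist() on the selected column. The sketch below builds the two-level header by hand instead of reading data.xlsx, so it runs without the file. The values are made up, and the placeholder name Unnamed: 0_level_1 is an assumption about what pandas generates for the empty sub-header cell.

```python
import pandas as pd

# Hand-built stand-in for what a two-row header could look like after reading.
# "Unnamed: 0_level_1" is an assumed placeholder for the empty cell under "Country".
columns = pd.MultiIndex.from_tuples([
    ("Country", "Unnamed: 0_level_1"),
    ("Media Tally", "Gold"),
    ("Media Tally", "Silver"),
])
df = pd.DataFrame(
    [["Great Britain", 1, 2], ["China", 3, 4]],  # made-up values
    columns=columns,
)

# Blank out the placeholder labels; the function is applied to every level.
df = df.rename(columns=lambda x: "" if "Unnamed" in str(x) else x)

# Select a column by its full tuple and convert it to a list.
countries = df[("Country", "")].tolist()
gold = df[("Media Tally", "Gold")].tolist()
print(countries)
print(gold)
```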

