How to extract Excel column data into a Python list using pandas when the header has merged cells
I'm trying to extract the 'Country' column data into a Python list using pandas. Below is the code I used. I have also attached the Excel sheet and the output.
Code:
from pandas import DataFrame
import pandas as pd
open_file = pd.read_excel('data.xlsx', sheet_name=0)
df = list(open_file['Country'])
print(df)
Output:
[nan, 'Great Britain', 'China ', 'Russia', 'United States', 'Korea', 'Japan', 'Germany']
Process finished with exit code 0
In the output I can see 'nan' because two cells in the sheet are merged into one. How can I avoid this?
Use header=1 so that the second row is read as the header. You can then select the column by its auto-generated name ('Unnamed: 0', 'Unnamed: 1', or 'Unnamed: 2', depending on the column's position) and convert its values to a list:
import pandas as pd
df = pd.read_excel('data.xlsx', sheet_name=0, header=1)
print(df['Unnamed: 0'].to_list())
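If you would rather keep the default header row, a different approach from the one above is to drop the missing value from the column before converting it to a list. This is only a minimal sketch: the DataFrame below is a hypothetical stand-in for what pd.read_excel('data.xlsx', sheet_name=0) seems to return judging by the output in the question, and the 'Media Tally' values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for pd.read_excel('data.xlsx', sheet_name=0):
# the first data row holds the second header row, so 'Country' is NaN there.
open_file = pd.DataFrame({
    "Country": [np.nan, "Great Britain", "China ", "Russia"],
    "Media Tally": ["Gold", 1, 2, 3],  # made-up values
})

# Drop the NaN left by the merged header cell, trim stray spaces, convert to a list.
countries = open_file["Country"].dropna().str.strip().tolist()
print(countries)
```

Note that dropna() would also remove genuinely empty cells further down the column, so this only fits when the column has no other gaps.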
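Try this:
import pandas as pd
df = pd.read_excel('data.xlsx', header=[0, 1])  # read the first two rows as a two-level header
df = df.rename(columns=lambda x: x if 'Unnamed' not in str(x) else '')  # blank out the 'Unnamed: ...' labels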
Now the headers are tuples. For example, to access the Country column or the Gold column, you need to write statements like these:
print(df[('Country', '')])
print(df[('Media Tally', 'Gold')])
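To get plain Python lists out of the two-level header, something like the following should work. It is a sketch that runs without the Excel file: the MultiIndex columns are built by hand to imitate what header=[0, 1] would likely produce, the 'Unnamed: 0_level_1' label and the medal counts are assumptions, and 'Media Tally' is copied from the statements above.

```python
import pandas as pd

# Hand-built imitation of the columns read with header=[0, 1] (assumed labels).
columns = pd.MultiIndex.from_tuples([
    ("Country", "Unnamed: 0_level_1"),
    ("Media Tally", "Gold"),
    ("Media Tally", "Silver"),
    ("Media Tally", "Bronze"),
])
df = pd.DataFrame(
    [["Great Britain", 1, 2, 3], ["China ", 4, 5, 6]],  # made-up values
    columns=columns,
)

# Same rename as above: blank out every auto-generated 'Unnamed' label.
df = df.rename(columns=lambda x: "" if "Unnamed" in str(x) else x)

# Select each column by its (top level, second level) tuple and convert to a list.
country_list = df[("Country", "")].tolist()
gold_list = df[("Media Tally", "Gold")].tolist()
print(country_list)
print(gold_list)
```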