How to use string as column name in pandas dataframe

I've got an Excel workbook that I am reading data from and processing. In the workbook, some of the column headers are numbers, and I don't know how to use them in pandas. I am also not allowed to change the column titles in Excel (for the purposes of this project).

In this case, the column headers are all the same (e.g. 2008, 2008, and 2008) and are all numbers. This makes sense in the context of my project but is confusing to pandas and to me. They are distinguished because the row above them in the Excel workbook has more info.

import pandas as pd

filename = 'myfile.xlsx'
data = pd.read_excel(filename, skiprows=8)

print("Column Headings")
print(data.columns)

Results of printing the column headers (shortened list):

Index([2008, '2008.1', '2008.2'], dtype='object')

Now I need to use these column names to get at the data in those columns...

provider_name = 'example_name'
subset_by_provider = data.loc[data['Provider'] == provider_name]

#the error is here. 2008 is the column name
data_2008 = subset_by_provider.2008.tolist() 

As I indicated above, the error is in the last line of code, where I am reading the data into a list. 2008 (as an integer) and '2008.1' are names of the columns in my Excel sheet, but I get a syntax error.

#Doesn't work
data_2008 = subset_by_provider.2008.tolist()

#Doesn't work
data_2008 = subset_by_provider.'2008.1'.tolist()

#Does work
data_2008 = subset_by_provider.i2008.tolist()

In the last example, I changed the column name in the Excel sheet from 2008 to i2008, just to prove a point. However, in practice, I am not allowed to do this.

How can I read the column name 2008 or '2008.1'?

As noted in the comments above, the solution is to use bracket indexing:

data_2008 = subset_by_provider[2008].tolist()

or

data_2008 = subset_by_provider['2008.1'].tolist()
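Attribute-style access (`df.name`) only works when the column label is a valid Python identifier, which neither the integer 2008 nor the string '2008.1' is. That is why the dotted forms raise a syntax error while brackets work. Below is a minimal sketch of this; the DataFrame contents are made up to stand in for the Excel sheet, and the final step (casting all labels to strings on the in-memory frame) is an optional assumption about what may be convenient, not something the question requires.

```python
import pandas as pd

# Hypothetical stand-in for the sheet described in the question:
# one integer label (2008) and two string labels ('2008.1', '2008.2').
data = pd.DataFrame(
    {
        "Provider": ["example_name", "other_name", "example_name"],
        2008: [10, 20, 30],
        "2008.1": [1.5, 2.5, 3.5],
        "2008.2": [7, 8, 9],
    }
)

subset_by_provider = data.loc[data["Provider"] == "example_name"]

# Bracket indexing accepts any label, but the type must match:
# the integer 2008 and the string '2008.1' are different kinds of keys.
data_2008 = subset_by_provider[2008].tolist()
data_2008_1 = subset_by_provider["2008.1"].tolist()

# Optional: cast every label to str on the in-memory frame only
# (the Excel file is untouched), so all columns are looked up the same way.
as_text = subset_by_provider.copy()
as_text.columns = as_text.columns.map(str)
data_2008_text = as_text["2008"].tolist()
```

Note that the label type matters: with the original columns, `subset_by_provider['2008']` would raise a `KeyError`, because that column's label is the integer 2008, not a string.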

