
Pandas dropping columns and rows from a dataframe that came from Excel

I am trying to drop some unneeded columns from a dataframe, but I get the error "too many indices for array".

Here is my code:

import pandas as pd
def answer_one():
    energy = pd.read_excel("Energy Indicators.xls")
    energy.drop(energy.index[0,1], axis = 1)
answer_one()

Option 1
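The indexing syntax is wrong: energy.index[0,1] passes two indices to a one-dimensional Index, which is what raises "too many indices for array". Also, with axis=1 you need to pick labels from energy.columns, not energy.index. Pass the positions as a list: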

import pandas as pd

energy = pd.read_excel("Energy Indicators.xls")
# drop returns a new dataframe, so assign the result
energy = energy.drop(energy.columns[[0,1]], axis=1)
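
To see both the failure and the fix without the Excel file, here is a minimal sketch on a small in-memory frame. The column names and values are made up for illustration; only the indexing pattern matters.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet; names and values are invented.
energy = pd.DataFrame({
    "junk_a": ["x", "y"],
    "junk_b": ["p", "q"],
    "Country": ["Spain", "Chile"],
    "Energy Supply": [4923, 1613],
})

# energy.index is one-dimensional, so the tuple key (0, 1) asks for two axes.
index_error = None
try:
    energy.index[0, 1]
except IndexError as exc:
    index_error = exc
print(type(index_error).__name__)

# Double brackets: a list of positions selects the first two column labels.
trimmed = energy.drop(energy.columns[[0, 1]], axis=1)
print(list(trimmed.columns))
```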

Option 2
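I'd select the columns to keep by position with iloc instead:

import pandas as pd

energy = pd.read_excel("Energy Indicators.xls")
# keep every row, and every column from position 2 onward
energy = energy.iloc[:, 2:]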
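
As a rough check that the iloc slice and the positional drop select the same data, the sketch below compares them on a made-up frame (the column names are assumptions, not taken from the real sheet).

```python
import pandas as pd

# Hypothetical data standing in for the Excel sheet.
energy = pd.DataFrame({
    "junk_a": ["x", "y"],
    "junk_b": ["p", "q"],
    "Country": ["Spain", "Chile"],
    "Energy Supply": [4923, 1613],
})

# Keep all rows and every column from position 2 onward.
by_position = energy.iloc[:, 2:]

# Remove the first two column labels instead.
by_drop = energy.drop(energy.columns[[0, 1]], axis=1)

# Both approaches should leave the same columns and values.
same_columns = list(by_position.columns) == list(by_drop.columns)
same_values = by_position.to_dict("list") == by_drop.to_dict("list")
print(same_columns, same_values)
```

Neither expression changes energy itself, so assign the result if you want to keep it.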

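I think it is better to skip the unneeded columns when reading the Excel file. Older pandas releases called this keyword parse_cols; it was deprecated in favor of usecols and removed in pandas 1.0. Recent pandas versions may reject a range that extends past the sheet's last column, in which case end the range at the actual last column letter:

energy = pd.read_excel("Energy Indicators.xls", usecols='C:ZZ')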

To drop columns, you need to change the syntax. You can refer to columns by header name or by position. Here is how to refer to them by name:

import pandas as pd

energy = pd.read_excel("Energy Indicators.xls")
energy.drop(['first_column', 'second_column'], axis=1, inplace=True)
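
A small sketch of dropping by name on an in-memory frame, assuming placeholder headers like the ones above. It uses the columns= keyword, which is equivalent to passing the labels with axis=1, and errors="ignore" for labels that may not be present.

```python
import pandas as pd

# Placeholder headers; substitute the real column names from your sheet.
energy = pd.DataFrame({
    "first_column": ["x", "y"],
    "second_column": ["p", "q"],
    "Country": ["Spain", "Chile"],
})

# Reassigning instead of inplace=True keeps the data flow explicit.
energy = energy.drop(columns=["first_column", "second_column"])

# errors="ignore" skips missing labels instead of raising KeyError.
energy = energy.drop(columns=["not_there"], errors="ignore")
print(list(energy.columns))  # ['Country']
```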

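Another solution is to exclude them at read time. usecols does not accept slice syntax such as [2:]; pass an Excel-style column range or a list of column positions instead (the end column "F" below is a placeholder for the sheet's last column):

energy = pd.read_excel("Energy Indicators.xls", usecols="C:F")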

This will help speed up the import as well.
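
If you do not know the last column in advance, one possible approach is to read only the header row first and then build the list of positions. This is a sketch, not a pandas feature: read_excel_skipping_leading and n_skip are hypothetical names. It costs an extra header read, and reading .xlsx files needs an engine such as openpyxl installed.

```python
import pandas as pd


def read_excel_skipping_leading(path, n_skip=2, sheet_name=0):
    """Read an Excel sheet while leaving out its first n_skip columns.

    Hypothetical helper for illustration only.
    """
    # First pass: nrows=0 loads just the header, which gives the column count.
    header = pd.read_excel(path, sheet_name=sheet_name, nrows=0)
    keep = list(range(n_skip, header.shape[1]))
    # Second pass: load only the wanted column positions.
    return pd.read_excel(path, sheet_name=sheet_name, usecols=keep)
```

Usage would look like energy = read_excel_skipping_leading("Energy Indicators.xls"), assuming the header is on the first row of the sheet.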

