pd.read_csv creates a multi-index dataframe if I have blank header entries

Question

I have a csv where not all column headers are specified.

temp.csv reads,

a, b
1, 2, 3, 4
5, 6, 7, 8

When I try to read this with pandas, I get a multi-index dataframe.

pd.read_csv('temp.csv')

produces the output,

        a   b
1   2   3   4
5   6   7   8

What I want is for the [1, 5] column header to be 'a', and the [2, 6] column to be 'b'. Explicitly setting index_col=None does not fix the problem. Any ideas?

Edit: Thanks ALollz. I modified your answer slightly so I only read the file once. (I'll be reading a lot of files.)

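import pandas as pd

df = pd.read_csv('temp.csv')
names = df.columns.tolist()  # the headers that were present in the file
df = df.reset_index()        # move the implicit index levels back into columns
# Keep the given names for the leading columns; number the unnamed ones 0, 1, ...
df.columns = names + list(range(df.shape[1] - len(names)))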
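
For reading many files, the single-read approach above could be wrapped in a small helper. This is only a sketch: read_single_pass is a hypothetical name, the RangeIndex check (to leave files with a complete header untouched) and skipinitialspace=True (to strip the space after each comma, so the second header is 'b' rather than ' b') are my own additions, and io.StringIO stands in for temp.csv.

```python
import io

import pandas as pd


def read_single_pass(source):
    """Read a CSV once and relabel columns when the header is too short."""
    df = pd.read_csv(source, skipinitialspace=True)
    # A default RangeIndex means pandas did not build an implicit index,
    # so the header already covers every column.
    if isinstance(df.index, pd.RangeIndex):
        return df
    names = df.columns.tolist()
    df = df.reset_index()  # the surplus leading columns come back as columns
    # Given names label the leading columns; the rest are numbered 0, 1, ...
    df.columns = names + list(range(df.shape[1] - len(names)))
    return df


sample = io.StringIO("a, b\n1, 2, 3, 4\n5, 6, 7, 8\n")
fixed = read_single_pass(sample)
print(fixed)
```

The helper accepts anything pd.read_csv accepts as its first argument, so a file path works in place of the StringIO object.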

Answer 1

You can override the incomplete header by combining header=0, which tells pandas to discard the first line as a header, with the full list of names you want:

pd.read_csv('temp.csv', header=0, names=['a', 'b', 'col1', 'col2'])
#   a  b  col1  col2
#0  1  2     3     4
#1  5  6     7     8

If you don't want to specify the names manually, you can read just the first row to get the existing headers and then work out how many additional names you need to supply.

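# Read one row; pandas moves the surplus leading columns into the index.
first = pd.read_csv('temp.csv', nrows=1)
named = first.columns.tolist()
# Assumes at least one header is missing, so each index level is an unnamed column.
n_extra = first.index.nlevels
names = named + [f'col{i}' for i in range(1, n_extra + 1)]

df = pd.read_csv('temp.csv', header=0, names=names)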
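
An alternative sketch that avoids parsing with pandas twice: peek at the header and the first data row with the standard csv module, compare the field counts, and then call pd.read_csv once with the padded names. The helper name read_with_padded_header and the prefix parameter are hypothetical, the file is assumed to have at least one data row, and skipinitialspace=True is my addition to strip the space after each comma.

```python
import csv
import os
import tempfile

import pandas as pd


def read_with_padded_header(path, prefix='col'):
    """Pad a too-short header with generated names, then read the file."""
    # Peek at the header and the first data row to compare field counts.
    with open(path, newline='') as fh:
        reader = csv.reader(fh, skipinitialspace=True)
        header = next(reader)
        first_row = next(reader)
    n_extra = max(len(first_row) - len(header), 0)
    names = header + [f'{prefix}{i}' for i in range(1, n_extra + 1)]
    # header=0 discards the original header line in favour of `names`.
    return pd.read_csv(path, header=0, names=names, skipinitialspace=True)


# Demo on a throwaway copy of the file from the question.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'temp.csv')
    with open(path, 'w') as fh:
        fh.write('a, b\n1, 2, 3, 4\n5, 6, 7, 8\n')
    padded = read_with_padded_header(path)

print(padded.columns.tolist())  # ['a', 'b', 'col1', 'col2']
```

Unlike counting index levels, this also handles a file whose header is already complete, since n_extra is then 0.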

solution1
1 ACCEPTED 2020-03-12 20:01:54
