
Overwrite column names in a pandas DataFrame automatically

I am trying to give every column in a CSV file a dummy name, using the integers from 0 to 400. However, the following code doesn't work: I get an error saying the syntax is wrong. What is my mistake?

df = pd.read_csv("df.csv", sep=',', encoding='utf-8', header=0, names = [0:400])

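The syntax error comes from names = [0:400]: slice notation is only valid inside an index expression such as x[0:400], not as a list literal. I think you can change to header=None, add the parameter skiprows=1 and omit the parameter names, because read_csv then names the columns from 0 to (number of columns - 1) by default. The parameter sep=',' is the default, so it can be omitted too.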

Sample:

import pandas as pd
import io

temp=u"""a,b,c
1,5,7
2,7,8
3,1,9
4,8,6
1,5,3"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), header=None, skiprows=1, encoding='utf8')
print(df)
   0  1  2
0  1  5  7
1  2  7  8
2  3  1  9
3  4  8  6
4  1  5  3

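Or keep header=0 and change the parameter names to names=range(400) if you have 400 columns (note that range(n) yields 0 to n-1, so labels 0 through 400 would need range(401)). For the three-column sample: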

df = pd.read_csv(io.StringIO(temp), header=0, names=range(3), encoding='utf8')
print(df)
   0  1  2
0  1  5  7
1  2  7  8
2  3  1  9
3  4  8  6
4  1  5  3

This is the same as:

df = pd.read_csv("df.csv", sep=',',  encoding='utf-8', header=None, skiprows = 1)
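If the number of columns is not known in advance, another option is to read the file with its original header and overwrite the labels afterwards. The following is only a minimal sketch of that idea, not part of the answer above; the variable name csv_text is made up for illustration and the data is the same three-column sample.

```python
import io

import pandas as pd

# Same sample data as above; csv_text is a hypothetical name for illustration.
csv_text = """a,b,c
1,5,7
2,7,8
3,1,9
4,8,6
1,5,3"""

# Read the file normally, so the first row is consumed as the header.
df = pd.read_csv(io.StringIO(csv_text))

# Overwrite the labels with 0..n-1, where n is the actual column count.
df.columns = range(df.shape[1])

print(df)
```

This avoids hard-coding 400 (or 401), since the range is derived from the DataFrame itself.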

