
Is there a way to name only certain columns in Pandas read_csv?

I know it's possible to name columns when using pandas.read_csv() by passing the optional names = ['X', 'Y', 'Z', ...] parameter. However, my question is: can you name only the first X columns and let the rest be auto-named?

Basically, I have a CSV with 23 columns that I want to name, and a further 1023 columns that I need to keep in the DataFrame but whose names I don't care about. Here's an image to illustrate the requirement:

[Image: DataFrame showing the columns that need to be renamed]

I don't see a setting in pandas to do this, so I just generate a list of column names and rename the columns in the DataFrame.

This will work even if you don't know how many columns to expect at the end.
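For the opposite case, where the total column count is known in advance (as with the 23 + 1023 columns in the question), a minimal sketch is to build the full list up front and pass it to read_csv through names. The in-memory CSV, the variable names, and the value of total_cols below are assumptions made for illustration, not part of the original answer.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for the real file: 5 columns, no header row.
csv_text = "1,2,3,4,5\n6,7,8,9,10\n"

known_names = ["MyColName1", "MyColName2"]  # the columns we want to name
total_cols = 5  # assumed to be known in advance

# Pad the known names with generated placeholders so the list covers every column.
all_names = known_names + [
    f"UnnamedColumn{i}" for i in range(1, total_cols - len(known_names) + 1)
]

df = pd.read_csv(io.StringIO(csv_text), header=None, names=all_names)
print(df.columns.tolist())
```

This skips the separate renaming step, but only works when total_cols really matches the file.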

Dynamically Rename Columns in Data Frame

import pandas

# Read the file; header=None makes pandas assign default integer column labels
myFile = pandas.read_csv("C:\\python_work_area\\TestFile.csv", header=None)

# Set known column names
arr_colName = ["MyColName1", "MyColName2", "MyColName3"]

numOfUnknownCols = len(myFile.columns) - len(arr_colName)
# Generate one number per unknown column. Could hard-code numOfUnknownCols if the column count is known
arr_nums = list(range(1, numOfUnknownCols + 1))

# Add numbered placeholder names to arr_colName
for i in arr_nums:
    arr_colName.append("UnnamedColumn" + str(i))

# Rename the columns. set_axis returns a new DataFrame (its inplace argument was removed in pandas 2.0), so reassign the result
myFile = myFile.set_axis(arr_colName, axis=1)
print(myFile)
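If the placeholder names don't matter at all, another option is to leave the remaining columns with the integer labels pandas assigns under header=None and rename only the leading ones. The sketch below assumes that is acceptable; name_leading_columns is a hypothetical helper and the in-memory CSV is made up for illustration.

```python
import io

import pandas as pd


def name_leading_columns(df, names):
    """Return a copy of df with its first len(names) columns renamed."""
    # Map the existing labels of the leading columns to the desired names.
    mapping = dict(zip(df.columns[: len(names)], names))
    return df.rename(columns=mapping)


# Hypothetical in-memory CSV: 4 columns, no header row.
csv_text = "a,b,c,d\ne,f,g,h\n"
df = pd.read_csv(io.StringIO(csv_text), header=None)

renamed = name_leading_columns(df, ["MyColName1", "MyColName2"])
print(renamed.columns.tolist())
```

The trade-off is that the resulting column index mixes strings and integers, which can be awkward for later label-based selection.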

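Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you reproduce this content, please credit this site's URL or the original address. For any questions, contact yoyou2525@163.com.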

 