How to remove double quotes from the index of a CSV file in Python

Question

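I am trying to read several CSV files with Python. There is a small problem with the index of the raw data (that is, the header of the first column). Part of one CSV file looks like this: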

NoDemande;"NoUsager";"Sens";"IdVehiculeUtilise";"NoConducteur";"NoAdresse";"Fait";"HeurePrevue"
42210000003;"42210000529";"+";"265Véh";"42210000032";"42210002932";"1";"25/07/2015 10:00:04"
42210000005;"42210001805";"+";"265Véh";"42210000032";"42210002932";"1";"25/07/2015 10:00:04"
42210000004;"42210002678";"+";"265Véh";"42210000032";"42210002932";"1";"25/07/2015 10:00:04"
42210000003;"42210000529";"—";"265Véh";"42210000032";"42210004900";"1";"25/07/2015 10:50:03"
42210000004;"42210002678";"—";"265Véh";"42210000032";"42210007072";"1";"25/07/2015 11:25:03"
42210000005;"42210001805";"—";"265Véh";"42210000032";"42210004236";"1";"25/07/2015 11:40:03"

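The first index has no "" in the file, but after reading the files it shows up as "NoDemande" (quotes included), while the others have no "". The rest of the columns look fine. The result looks like this (not the same rows as above):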

"NoDemande"     NoUsager Sens IdVehiculeUtilise NoConducteur    NoAdresse Fait          HeurePrevue
42209000003  42209001975    +            245Véh  42209000002  42209005712    1   24/07/2015 06:30:04
42209000004  42209002021    +            245Véh  42209000002  42209005712    1   24/07/2015 06:30:04
42209000005  42209002208    +            245Véh  42209000002  42209005713    1   24/07/2015 06:45:04
42216000357  42216001501    -            190Véh  42216000139  42216001418    1   31/07/2015 17:15:03
42216000139  42216000788    -         309V7pVéh  42216000059  42216006210    1   31/07/2015 17:15:03
42216000118  42216000188    -            198Véh  42216000051  42216006374    1   31/07/2015 17:15:03

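This causes problems in later steps, when the column has to be identified by its name. How can I solve this? Here is my code for reading the files: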

import glob

import pandas as pd

pd.set_option('expand_frame_repr', False)
path = r'D:\Python27\mypfe\data_test'
allFiles = glob.glob(path + "/*.csv")
list_ = []

for file_ in allFiles:
    # Read every column as a string; fields are separated by ';'
    df = pd.read_csv(file_, header=0, sep=';', dayfirst=True, encoding='utf8',
                     dtype='str')

    # Replace the em dash (U+2014) in 'Sens' with a plain hyphen
    df['Sens'] = df['Sens'].replace(u'\u2014', '-')

    list_.append(df)
    print("fichier lu ", file_)

frame = pd.concat(list_)
print(frame)

Answer 1

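Actually, I had been fixated on how to remove the double quotes from the index. After looking at it from a different angle, I decided it was easier to add a new column, copy the values from the original column into it, and then drop the original. The new column then has the name you want. In my case, I did: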

frame['NoDemande'] = frame.iloc[:, 0]
tl = frame.drop(frame.columns[0], axis=1)

That gave me the new frame I wanted.
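A minimal sketch of the copy-and-drop approach above, on a small made-up frame (the two-column data is hypothetical and only mirrors the question's headers). It uses position-based selection with .iloc. Note that the clean column ends up as the last column, not the first.

```python
import pandas as pd

# Hypothetical two-column frame whose first header kept its quotes.
frame = pd.DataFrame({'"NoDemande"': ['42210000003', '42210000005'],
                      'NoUsager': ['42210000529', '42210001805']})

# Copy the first column (selected by position) under the clean name ...
frame['NoDemande'] = frame.iloc[:, 0]
# ... then drop the original, still-quoted column.
tl = frame.drop(frame.columns[0], axis=1)

print(tl.columns.tolist())  # ['NoUsager', 'NoDemande']
```

If the column order matters for later steps, renaming the column in place avoids the move to the end.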

Answer 2

I think the simplest approach is to set new column names:

df.columns = ['NoDemande1'] + df.columns[1:].tolist()
print (df)
    NoDemande1     NoUsager Sens IdVehiculeUtilise  NoConducteur    NoAdresse  \
0  42210000003  42210000529    +            265Véh   42210000032  42210002932   
1  42210000005  42210001805    +            265Véh   42210000032  42210002932   
2  42210000004  42210002678    +            265Véh   42210000032  42210002932   
3  42210000003  42210000529    -            265Véh   42210000032  42210004900   
4  42210000004  42210002678    -            265Véh   42210000032  42210007072   
5  42210000005  42210001805    -            265Véh   42210000032  42210004236   

   Fait          HeurePrevue  
0     1  25/07/2015;10:00:04  
1     1  25/07/2015;10:00:04  
2     1  25/07/2015;10:00:04  
3     1  25/07/2015;10:50:03  
4     1  25/07/2015;11:25:03  
5     1  25/07/2015;11:40:03
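As a variation on assigning the whole list of names, here is a hedged sketch that renames only the first column, selected by position, and leaves every other name untouched. The one-row frame is made up for illustration, and the target name 'NoDemande' is an assumption about what the asker wants.

```python
import pandas as pd

# Hypothetical one-row frame; only the first header carries stray quotes.
df = pd.DataFrame({'"NoDemande"': ['42210000003'],
                   'NoUsager': ['42210000529'],
                   'Sens': ['+']})

# Map the current name of the first column to the clean name.
# rename() returns a new frame, so the result is assigned back.
df = df.rename(columns={df.columns[0]: 'NoDemande'})

print(df.columns.tolist())  # ['NoDemande', 'NoUsager', 'Sens']
```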

Another solution is to strip the " character from the column names:

print (df)
   "NoDemande"     NoUsager Sens IdVehiculeUtilise  NoConducteur    NoAdresse  \
0  42210000003  42210000529    +            265Véh   42210000032  42210002932   
1  42210000005  42210001805    +            265Véh   42210000032  42210002932   
2  42210000004  42210002678    +            265Véh   42210000032  42210002932   
3  42210000003  42210000529    -            265Véh   42210000032  42210004900   
4  42210000004  42210002678    -            265Véh   42210000032  42210007072   
5  42210000005  42210001805    -            265Véh   42210000032  42210004236   

   Fait          HeurePrevue  
0     1  25/07/2015;10:00:04  
1     1  25/07/2015;10:00:04  
2     1  25/07/2015;10:00:04  
3     1  25/07/2015;10:50:03  
4     1  25/07/2015;11:25:03  
5     1  25/07/2015;11:40:03

df.columns = df.columns.str.strip('"')
print (df)
     NoDemande     NoUsager Sens IdVehiculeUtilise  NoConducteur    NoAdresse  \
0  42210000003  42210000529    +            265Véh   42210000032  42210002932   
1  42210000005  42210001805    +            265Véh   42210000032  42210002932   
2  42210000004  42210002678    +            265Véh   42210000032  42210002932   
3  42210000003  42210000529    -            265Véh   42210000032  42210004900   
4  42210000004  42210002678    -            265Véh   42210000032  42210007072   
5  42210000005  42210001805    -            265Véh   42210000032  42210004236   

   Fait          HeurePrevue  
0     1  25/07/2015;10:00:04  
1     1  25/07/2015;10:00:04  
2     1  25/07/2015;10:00:04  
3     1  25/07/2015;10:50:03  
4     1  25/07/2015;11:25:03  
5     1  25/07/2015;11:40:03
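Since the question reads several files and concatenates them, the strip probably belongs inside the reading loop, before pd.concat. Otherwise "NoDemande" (with quotes) and NoDemande would be treated as two different columns and the gaps filled with NaN. A rough sketch, using two made-up frames as stand-ins for the results of pd.read_csv:

```python
import pandas as pd

# Stand-ins for frames returned by pd.read_csv (hypothetical data):
# only the first one kept the quotes around its first header.
frames = [
    pd.DataFrame({'"NoDemande"': ['42209000003'], 'Sens': ['+']}),
    pd.DataFrame({'NoDemande': ['42216000357'], 'Sens': ['-']}),
]

cleaned = []
for df in frames:
    # Normalize the header before concatenating, so both spellings
    # end up in the same column.
    df.columns = df.columns.str.strip('"')
    cleaned.append(df)

frame = pd.concat(cleaned, ignore_index=True)
print(frame.columns.tolist())  # ['NoDemande', 'Sens']
```

In the question's loop, the same df.columns line would go right after the pd.read_csv call.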

2 answers

Answer 1: 0, 2016-09-07 08:19:03
Answer 2: 0, accepted, 2016-09-07 08:28:51
