Two columns in a CSV file are read as a single column (Python 2.7)

Question

I have two CSV files. I want each element in list A to be matched against every element in list B. List A acts as the training set, and list B contains errors, which are fixed by matching with edit distance.

The problem is that there are two columns in B. The first column holds unique numbers and the second column holds the string to be fixed.

I'm getting this output:

628227teitARMTEteke : iQIARMTEMAC
628226iQIARMTEMAC 9 : iQIARMTEMAC
628229iQIAConfigCH : iQIAConfigCH
627701iQIAConfigCH : iQIAConfigCH

But I want my output to be:

628227 : teitARMTEteke : iQIARMTEMAC
628226 : iQIARMTEMAC 9 : iQIARMTEMAC
628229 : iQIAConfigCH : iQIAConfigCH
627701 : iQIAConfigCH : iQIAConfigCH

Code

import csv
from nltk.metrics import distance


with open("all_correct_promo.csv","rb") as file1:
    reader1 = csv.reader(file1)
    correctPromoList = [''.join(i) for i in reader1]
   # print correctPromoList
with open("all_extracted_promo3.csv","rb") as file2:
    reader2 = csv.reader(file2)
    extractedPromoList = [''.join(i) for i in reader2]
    #print extractedPromoList

incorrectPromo = {}
count = 0
for extracted in extractedPromoList:
    #print 'Computing %dth promo code...' % count
    incorrectPromo[extracted] =  find_min_edit(extracted,correctPromoList) # get comma separated str of real promo codes nearest to extracted
    count+=1
#print incorrectPromo


for key, value in incorrectPromo.iteritems():
    print key ,':', value

Right now the unique numbers are read together with the strings, which affects how the strings get corrected. I want each number to be displayed with its string, but without affecting how the string is matched against the strings in list A.

Sample from all_extracted_promo3.csv:

628229  iQIABundUPGR
628229  iQIAPortUPGR
628229  iQIAConfigCH
628229  iQIARMTEMAC 9

Sample from all_correct_promo.csv:

iQ BundleUPGR
IQ MANAGED
IQ04 BRP
IQ1MOBILSUP
IQ2MOBILSUP
iQBundIeUPGR
iQBundle 1
iQBundle 2

Answer 1

Leaving aside the strange (to say the least) way you obtain the data, I'll answer strictly about csv.reader.

For csv.reader to distinguish columns, you need to set up its dialect to match your .csv file. As its docs say, it accepts all individual dialect formatting parameters as keyword arguments. Here, you're probably interested in delimiter:

csv.reader(<file>, delimiter=<whatever>)
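As a minimal sketch of that keyword in use, the snippet below parses an in-memory sample rather than a real file. The `;` separator and the `sample` variable are assumptions made for illustration only; the two values are taken from the question. It is written for Python 3; under Python 2.7 you would normally pass a file opened in "rb" mode instead.

```python
import csv
import io

# Hypothetical two-column sample that uses ';' as its separator.
sample = io.StringIO("628229;iQIAConfigCH\n628226;iQIARMTEMAC 9\n")

# Passing delimiter as a keyword argument makes the reader split on ';'.
rows = list(csv.reader(sample, delimiter=';'))
for row in rows:
    print(row)
```

Each row comes back as a list with one entry per column, so the number and the string can be handled separately.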

Judging by the excerpts, your all_extracted_promo3.csv uses two spaces as the delimiter, and all_correct_promo.csv uses a single space. csv.reader only supports single-character delimiters, though:

>>> [i for i in csv.reader(open("all_extracted_promo3.csv","rb"),delimiter=' ')]
[['628229', '', 'iQIABundUPGR'],
 ['628229', '', 'iQIAPortUPGR'],
 ['628229', '', 'iQIAConfigCH'],
 ['628229', '', 'iQIARMTEMAC', '9']]

So you'll have to either work around that (by ignoring the empty 2nd element), change the software that produces the file (e.g. to use the standard comma as the delimiter), or use some other facility to parse the file.
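The first workaround could look roughly like the sketch below. It is not the asker's actual code: `read_id_and_code`, `edit_distance` and `closest` are hypothetical helpers, with `closest` standing in for the undefined `find_min_edit` and the hand-written `edit_distance` standing in for `nltk.metrics.distance.edit_distance` so the example has no dependencies. The sample rows and the list of correct codes are copied from the question, and the file is replaced by an in-memory string. Written for Python 3.

```python
import csv
import io

# In-memory stand-in for all_extracted_promo3.csv (two spaces between columns).
SAMPLE = (
    "628229  iQIAConfigCH\n"
    "628229  iQIARMTEMAC 9\n"
)

def read_id_and_code(handle):
    """Yield (id, code) pairs from lines whose columns are split by two spaces."""
    for row in csv.reader(handle, delimiter=' '):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        # row[1] is the empty field between the two spaces. The code itself
        # may contain single spaces, so rejoin everything after that field.
        yield row[0], ' '.join(row[2:])

def edit_distance(a, b):
    """Plain Levenshtein distance, computed one row of the table at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(min(previous[j] + 1,                # deletion
                               current[j - 1] + 1,             # insertion
                               previous[j - 1] + (ca != cb)))  # substitution
        previous = current
    return previous[-1]

def closest(code, candidates):
    """Return the candidate with the smallest edit distance to code."""
    return min(candidates, key=lambda c: edit_distance(code, c))

correct = ["iQIARMTEMAC", "iQIAConfigCH"]
pairs = list(read_id_and_code(io.StringIO(SAMPLE)))
for promo_id, code in pairs:
    # Only the code is matched; the id is carried along for display.
    print("%s : %s : %s" % (promo_id, code, closest(code, correct)))
```

Keeping the number and the string as a pair means the number never takes part in the distance computation, yet it is still available when printing.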

Solution 1, dated 2016-11-30 07:29:24