
Using writerow from csv module to write columns starting after the first?

I'm new to Python and am working on a utility to prepare some data for analysis in R. So far the utility reads in two csv files, parses the URLs for TLDs and SLDs, and then writes those to a csv from a transposed list. I then need to copy the other columns of the csv files from "colrdr" directly into the output as columns 2 - 6. I tried some direct solutions first, just writing from colrdr the same way I wrote from MasterList previously, but that didn't work (it seemed the colrdr columns weren't being written at all). After reading some more documentation I also tried using append mode when creating the writer object for appending the copied columns, but that didn't work either.

Here are the relevant portions of the code:

Here is where I parse the data into TLDs/SLDs:

# Create a List for future frequency distribution
SLDList = list()
TLDList = list()
MasterList = [SLDList, TLDList]
for fl in infiles:
    with open(fl, 'r') as csvin:
        reader = csv.reader (csvin, delimiter = ',')
        reader.next()
        for row in reader:
            SLDList.append(gettld(row[urlcolumn]))
            TLDList.append(psl.get_public_suffix(row[urlcolumn]))
# Create a List of other columns of infiles
    with open(fl, 'r') as csvin:
        reader = csv.reader (csvin, delimiter = ',')
        colrdr.append(zip(*(list(reader))))

Here I'm creating a few lists, which I use zip to transpose into columns; the second part of the code writes from these.
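For readers unfamiliar with the zip(*...) idiom used above, here is a minimal sketch of what it does to a list of rows. The sample values are made up, and the snippet is written for Python 3, where zip returns an iterator and needs list() around it.

```python
# Rows as csv.reader would yield them (made-up sample values).
rows = [
    ["URL", "Registrar", "Host", "ASN"],
    ["example.com", "RegA", "HostA", "64500"],
    ["example.org", "RegB", "HostB", "64501"],
]

# zip(*rows) transposes the table: each tuple is one column.
columns = list(zip(*rows))

# Transposing a second time restores the row-wise layout,
# which is the shape writerow works with (one call per row).
rows_again = [list(r) for r in zip(*columns)]
```

Note that a transposed list holds columns, so passing its items to writerow writes each column out as a row.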

Here is the part where I'm writing, which I haven't really been able to figure out:

with open(outfile, 'wb') as csvout:
    wtr = csv.writer (csvout, delimiter=',',quotechar='|', quoting=csv.QUOTE_MINIMAL)
    for row in zip(*MasterList):
        wtr.writerow(row)
with open(outfile, 'a') as csvout:
    wtr = csv.writer (csvout, delimiter=',',quotechar='|', quoting=csv.QUOTE_MINIMAL)
    for row in colrdr:
        wtr.writerow(row)

The first part of this works fine. The second part does not do what I think it should, and unfortunately it won't even overwrite what was already written; it's as if it's just being ignored by the interpreter, and I don't quite understand why.

Any help would be appreciated, thank you!

EDIT: I can't give actual samples, but the input csv files are files from different sources that all have the format

URL, Registrar, Host, ASN

The output should be one file which combines all the others and has the format

TLD, SLD, Registrar, Host, ASN
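
One point worth spelling out about the append-mode attempt: opening a csv in append mode and calling writerow adds new rows below the existing ones, and never adds columns to the right of them. The sketch below shows this with a temporary file and made-up values; it is written for Python 3 and is not a diagnosis of the exact failure in the code above.

```python
import csv
import os
import tempfile

# Temporary output file (path is illustrative only).
path = os.path.join(tempfile.mkdtemp(), "out.csv")

# First pass: write two columns.
with open(path, "w", newline="") as f:
    csv.writer(f).writerows([["com", "example"], ["org", "sample"]])

# Second pass in append mode: these rows land BELOW the existing ones.
with open(path, "a", newline="") as f:
    csv.writer(f).writerows([["RegA", "HostA"], ["RegB", "HostB"]])

# Read the file back to see the resulting layout.
with open(path, newline="") as f:
    result = list(csv.reader(f))
```

Here result has four rows of two columns each, not two rows of four columns. To get extra columns, each output row has to be assembled in full before it is written.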

newrows = list()
for fl in infiles:
    with open(fl, 'rb') as csvin:
        reader = csv.reader(csvin, delimiter=',')
        reader.next() # skip header
        for row in reader:
            sld = gettld(row[urlcolumn])
            tld = psl.get_public_suffix(row[urlcolumn])
            newrows.append([sld, tld] + row[1:]) # row[1:] is everything but the first

with open(outfile, 'wb') as csvout:
    wtr = csv.writer(csvout, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    wtr.writerow(["TLD", "SLD", "Registrar", "Host", "ASN"]) # write header
    wtr.writerows(newrows)
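
The code above is Python 2 (reader.next(), binary file modes). For anyone on Python 3, a rough equivalent might look like the sketch below. It assumes the same input layout as in the question. split_domain is a hypothetical stand-in for gettld and psl.get_public_suffix, which are not shown in the question, and the in-memory files and their contents are made up so the snippet runs on its own.

```python
import csv
import io


def split_domain(url):
    """Hypothetical stand-in for gettld / psl.get_public_suffix.

    Naively treats the last dot-separated label as the TLD and the one
    before it as the SLD; real code should use a public suffix list.
    """
    labels = url.strip().lower().split(".")
    return labels[-1], labels[-2]


def merge_rows(sources, url_column=0):
    """Build output rows: TLD, SLD, then every column except the URL."""
    out = []
    for src in sources:
        reader = csv.reader(src)
        next(reader, None)  # skip the header line
        for row in reader:
            tld, sld = split_domain(row[url_column])
            rest = row[:url_column] + row[url_column + 1:]
            out.append([tld, sld] + rest)
    return out


# Two in-memory "files" with made-up contents.
file_a = io.StringIO("URL,Registrar,Host,ASN\nexample.com,RegA,HostA,64500\n")
file_b = io.StringIO("URL,Registrar,Host,ASN\nsample.org,RegB,HostB,64501\n")

rows = merge_rows([file_a, file_b])

# Write the header once, then all complete rows in a single pass.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["TLD", "SLD", "Registrar", "Host", "ASN"])
writer.writerows(rows)
```

With real files on Python 3, open them in text mode with newline='' (for example open(outfile, 'w', newline='')) instead of 'rb' / 'wb'.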
