
Removing a list of columns from a CSV file by index

I have a CSV file with the following contents:

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
1,10,19,,,,,,,,,,,,,
2,11,20,,,,,,,,,,,,,
3,12,21,,,,,,,,,,,,,
4,13,22,,,,,,,,,,,,,
5,14,23,,,,,,,,,,,,,
6,15,24,,,,,,,,,,,,,
7,16,25,,,,,,,,,,,,,
8,17,26,,,,,,,,,,,,,
9,18,27,,,,,,,,,,,,,

I need to delete a set of columns by index.

I tried the following code, but it does not return the expected result. Can someone help me?

import csv

def read():
    with open("test.csv", "rb") as fp_in, open("newfile.csv", "wb") as fp_out:
        reader = csv.reader(fp_in, delimiter=",")
        writer = csv.writer(fp_out, delimiter=",")
        col_list = [0,1,2,3,4,5,6,8]
        for row in reader:
            for col_item in col_list:
                print(col_item)
                del row[int(col_item)]
            writer.writerow(row)
read()

Result returned:

1,3,5,7,9,11,13,14
10,,,,,,,
11,,,,,,,
12,,,,,,,
13,,,,,,,
14,,,,,,,
15,,,,,,,
16,,,,,,,
17,,,,,,,
18,,,,,,,

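I think the problem is that the same row is modified on every iteration, so I need a way to remove all of the columns in the list.

Could someone help me with this?

The desired output should look like this: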

7,9,10,11,12,13,14,15
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
.
.
.
.

To be precise, I only want to delete the listed columns and their values.

Edit:

A clearer example:

def read():
    with open("test.csv", "rb") as fp_in, open("newfile.csv", "wb") as fp_out:
        reader = csv.reader(fp_in, delimiter=",")
        writer = csv.writer(fp_out, delimiter=",")
        col_list = [0,2]
        for row in reader:
            for col_item in col_list:
                print(col_item)
                del row[int(col_item)]
            writer.writerow(row)
read()

Output I get:

1,2,4
v,d,q
c,s,a
s,d,d
f,x,c

Expected:

1,3,4
v,s,q
c,d,a
s,f,d
f,a,c

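The problem is that you are modifying the row on every iteration over col_list. Each del shifts the remaining values one position to the left, so the later indices in col_list no longer point at the columns you meant to remove.

This should work: use a list comprehension to build a copy of the row that leaves out the indices in col_list.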

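import csv

def read():
    # newline="" lets the csv module handle line endings itself (Python 3)
    with open("test.csv", "r", newline="") as fp_in, open("newfile.csv", "w", newline="") as fp_out:
        reader = csv.reader(fp_in, delimiter=",")
        writer = csv.writer(fp_out, delimiter=",")
        col_list = [0,1,2,3,4,5,6,8]
        for row in reader:
            # keep only the values whose index is not in col_list
            output = [v for (i,v) in enumerate(row) if i not in col_list]
            writer.writerow(output)

read()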

This writes the following to newfile.csv:

7,9,10,11,12,13,14,15
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
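
As a small illustration of the index shifting described above, the sketch below reproduces the question's in-place loop on the header row and compares it with one possible alternative: deleting from the highest index down, so the lower indices stay valid. The helper names `drop_in_place_naive` and `drop_in_place_reversed` are made up for this example and are not part of the original answer.

```python
def drop_in_place_naive(row, col_list):
    """Mimics the question's loop: each del shifts later items left."""
    row = list(row)  # work on a copy so the caller's list is untouched
    for i in col_list:
        del row[i]
    return row


def drop_in_place_reversed(row, col_list):
    """Deletes the highest index first, so lower indices remain valid."""
    row = list(row)
    for i in sorted(set(col_list), reverse=True):
        del row[i]
    return row


# Header row from the question's file: "0" through "15"
header = [str(n) for n in range(16)]
col_list = [0, 1, 2, 3, 4, 5, 6, 8]

print(drop_in_place_naive(header, col_list))
# ['1', '3', '5', '7', '9', '11', '13', '14']
print(drop_in_place_reversed(header, col_list))
# ['7', '9', '10', '11', '12', '13', '14', '15']
```

The first result matches the unexpected header in the question, and the second matches the desired one. The list comprehension is probably still the simpler choice, since it never mutates the row at all.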

You can do something like this.

Assuming your input file is named input.txt:

with open('input.txt', 'r') as f:
    data = [k.split(',') for k in f.read().splitlines()]


for k in data:
    print(k[7] + ',' + ','.join(k[9:]))

And if you want to save the result to a file (for example, final_file.txt), you can do the following:

with open("final_file.txt", 'w') as f:
    for k in data:
        f.write(k[7] + ',' + ','.join(k[9:]) + '\n')

Output:

7,9,10,11,12,13,14,15
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
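
One caveat with splitting lines on commas by hand: it only works while no field contains a quoted comma. The sample line below is invented for illustration (it is not from the question's file) and shows how `str.split` and `csv.reader` can differ on such input.

```python
import csv
import io

# Hypothetical sample line: the second field contains a comma inside quotes.
sample = 'a,"b,c",d\n'

# Plain string splitting treats every comma as a separator.
naive = sample.rstrip("\n").split(",")

# csv.reader understands the quoting and keeps the field together.
parsed = next(csv.reader(io.StringIO(sample)))

print(naive)   # ['a', '"b', 'c"', 'd']
print(parsed)  # ['a', 'b,c', 'd']
```

For the file in the question, which has no quoted fields, both approaches give the same columns.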

You can try using pandas to drop the specific columns and then write the result to a CSV file:

import pandas as pd
df = pd.read_csv('test.csv')
df = df.drop(['0','1','2','3','4','5','6','8'], axis=1)
df.to_csv('newfile.csv',index=False)

newfile.csv will be:

7,9,10,11,12,13,14,15
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
,,,,,,,
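
Since the question asks about positions rather than column names, here is a minimal sketch of the same `drop` call driven by integer positions. It assumes the five-column layout from the question's edit; the in-memory `csv_text` and its cell values are stand-ins invented for this example.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for test.csv (values are made up)
csv_text = "0,1,2,3,4\nv,a,s,d,q\nc,b,d,s,a\n"
df = pd.read_csv(io.StringIO(csv_text))

col_list = [0, 2]  # positions of the columns to drop

# df.columns[col_list] turns the positions into the matching labels
result = df.drop(columns=df.columns[col_list])

print(list(result.columns))  # ['1', '3', '4']
```

This avoids hard-coding the header strings, which may help when the header values do not happen to equal the positions.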

You can also use iloc from the pandas library:

import pandas as pd

# load csv file
df = pd.read_csv('newfile.csv')

# store all rows + 1st, 2nd, 5th and 6th columns into another df
modified_df = df.iloc[:, [0, 1, 4, 5]] 

# print out
print(modified_df)
