Removing lines in CSV file is adding extra lines

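I'm working on a coding assignment in which one of the application's requirements is the ability to delete a row of interest from a CSV file. When I try to delete the row identified by a key (the name), it not only deletes that row but also adds multiple copies of my first row to the CSV file. I can't figure out why these duplicate rows are being added. Any help is appreciated.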

For reference: attractions is a list of dictionaries that the CSV file is copied into.

The delete function is as follows:

name = entername()

with open('boston.csv', 'r') as csv_read:
    reader = csv.reader(csv_read)
    for row in reader:
        attractions.append(row)
        for field in row:
            if field == name:
                attractions.remove(row)

with open('boston.csv', 'w') as csv_write:
    writer = csv.writer(csv_write)
    writer.writerows(attractions)

My CSV file looked like this before:

Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green

But the result is:

Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green

I ran your code, and it appears to work.

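I modified it so that it does not overwrite the input file (very useful while debugging), prints a message whenever it removes a row, and hard-codes the name (again, only for debugging):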

import csv

name = 'Harvard University'

attractions = []
with open('boston.csv', 'r') as csv_read:
    reader = csv.reader(csv_read)
    for row in reader:
        attractions.append(row)
        for field in row:
            if field == name:
                print(f'{field} matches {name}, removing {row}')
                attractions.remove(row)

with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(attractions)

When I run it, I see this debug message:

Harvard University matches Harvard University, removing ['harvard', 'Harvard University', 'university', 'https://www.harvard.edu/', '42.373032', '-71.116661', 'green']

And this is my output.csv:

Short Name,Name,Category,URL,Lat,Lon,Color
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green

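When I change name to name = 'Tourism', which is valid for your logic (even if it isn't what you want or intended), it still does what you'd expect, removing the two rows that have Tourism in the Category field: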

...
name = 'Tourism'

attractions = []
...
Tourism matches Tourism, removing ['science', 'Museum of Science', 'Tourism', 'https://www.mos.org/', '42.36932', '-71.07151', 'green']
Tourism matches Tourism, removing ['children', "Boston Children's Museum", 'Tourism', 'https://bostonchildrensmuseum.org/', '42.3531', '-71.04998', 'green']
Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green
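
Two things in the runs above may be worth acting on. First, the debugging version creates attractions = [] right before reading the file; if the original program already had rows in attractions from an earlier load (the question does not show this, so it is only a guess), each call would append the file's contents again and write the extras back out. Second, comparing every field against name also removes rows that merely share a category. A minimal sketch that avoids both, assuming the key lives in the Name column and that the first line is a header; the function name remove_rows_by_name and its key_column parameter are hypothetical, not from the question:

```python
import csv


def remove_rows_by_name(src_path, dst_path, name, key_column="Name"):
    """Copy src_path to dst_path, skipping rows whose key_column equals name.

    Returns the number of data rows that were removed.
    Assumes the file is non-empty and its first line is the header.
    """
    with open(src_path, "r", newline="") as csv_read:
        reader = csv.reader(csv_read)
        header = next(reader)                  # first line holds the column names
        key_index = header.index(key_column)   # ValueError if the column is missing
        # Build a fresh local list on every call, so rows left over from an
        # earlier read can never be written back a second time.
        data_rows = [row for row in reader if row]  # skip blank lines

    # Compare only the key column, not every field in the row.
    kept = [row for row in data_rows if row[key_index] != name]

    with open(dst_path, "w", newline="") as csv_write:
        writer = csv.writer(csv_write)
        writer.writerow(header)
        writer.writerows(kept)

    return len(data_rows) - len(kept)
```

Calling remove_rows_by_name('boston.csv', 'output.csv', 'Harvard University') would write the filtered copy to output.csv; passing the same path for both arguments overwrites the input, which works here because the file is fully read and closed before it is reopened for writing.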

There is a pure-Python library, convtools, which generates code under the hood and provides many data-processing primitives:

from convtools import conversion as c
from convtools.contrib.tables import Table

name = entername()

table = Table.from_csv("boston.csv")  # pass header=True if it's there
columns = table.columns
table.filter(
    c.not_(
        c.or_(*(c.col(column_name) == name for column_name in columns))
        if len(columns) > 1
        else c.col(columns[0]) == name
    )
).into_csv("boston_output.csv")
