
Removing lines in CSV file is adding extra lines

I am working on a coding assignment where one of the requirements of the app is to be able to remove lines of interest from a CSV file. When I try to remove the line identified by the key (name), it not only removes that line but also adds multiple copies of my first line to the CSV file. I can't figure out why it is adding these repeated lines. Any help is appreciated.

For reference: attractions is a list of dictionaries that the CSV file was copied into.

The delete function is below:

name = entername()

with open('boston.csv', 'r') as csv_read:
    reader = csv.reader(csv_read)
    for row in reader:
        attractions.append(row)
        for field in row:
            if field == name:
               attractions.remove(row)

with open('boston.csv', 'w') as csv_write:
    writer = csv.writer(csv_write)
    writer.writerows(attractions)

Before the delete, my CSV file looks like this:

Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green

but afterwards it looks like this:

Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green

I've run your code and it appears to work.

I modified it to not overwrite the input file (which is very helpful when debugging), to print a message when a row is removed, and to hard-code the name (again, just for debugging):

import csv

name = 'Harvard University'

attractions = []
with open('boston.csv', 'r') as csv_read:
    reader = csv.reader(csv_read)
    for row in reader:
        attractions.append(row)
        for field in row:
            if field == name:
                print(f'{field} matches {name}, removing {row}')
                attractions.remove(row)

with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(attractions)

When I run that, I see this debug print message:

Harvard University matches Harvard University, removing ['harvard', 'Harvard University', 'university', 'https://www.harvard.edu/', '42.373032', '-71.116661', 'green']

and this is my output.csv:

Short Name,Name,Category,URL,Lat,Lon,Color
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green
science,Museum of Science,Tourism,https://www.mos.org/,42.36932,-71.07151,green
children,Boston Children's Museum,Tourism,https://bostonchildrensmuseum.org/,42.3531,-71.04998,green
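Note that the run above starts from an empty attractions list. The question says attractions already holds a list of dictionaries, and that may be where the extra header lines come from: csv.writer iterates each row it is given, and iterating a dictionary yields its keys, so every dictionary in the list is written out as a copy of the header. The sketch below reproduces this with in-memory data. The three-column sample, the use of csv.DictReader to pre-fill the list, and name = 'mit' are all assumptions made for illustration, not details taken from the question.

```python
import csv
import io

# Hypothetical, shortened sample data (three columns instead of seven).
CSV_TEXT = (
    "Short Name,Name,Category\n"
    "harvard,Harvard University,university\n"
    "mit,Massachusetts Institute of Technology,University\n"
)

# Assumption: attractions was filled earlier with one dict per data row.
attractions = list(csv.DictReader(io.StringIO(CSV_TEXT)))

name = "mit"  # hypothetical key entered by the user

# Same logic as the delete function in the question.
for row in csv.reader(io.StringIO(CSV_TEXT)):
    attractions.append(row)
    for field in row:
        if field == name:
            attractions.remove(row)

out = io.StringIO()
csv.writer(out).writerows(attractions)  # each dict is written as its keys
result = out.getvalue()
lines = result.splitlines()
print("\n".join(lines))
```

With two dictionaries already in the list, the header appears three times in the output: twice from the dictionaries and once from the real header row. Four pre-loaded dictionaries would give the five header lines shown in the question. Building a fresh list inside the delete function, rather than appending to the existing one, avoids this.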

When I change name to name = 'Tourism', which is valid with your logic (even if it isn't what you want or intend), it still does what you'd expect and removes the two rows where Tourism is in the Category field:

...
name = 'Tourism'

attractions = []
...
Tourism matches Tourism, removing ['science', 'Museum of Science', 'Tourism', 'https://www.mos.org/', '42.36932', '-71.07151', 'green']
Tourism matches Tourism, removing ['children', "Boston Children's Museum", 'Tourism', 'https://bostonchildrensmuseum.org/', '42.3531', '-71.04998', 'green']
Short Name,Name,Category,URL,Lat,Lon,Color
harvard,Harvard University,university,https://www.harvard.edu/,42.373032,-71.116661,green
mit,Massachusetts Institute of Technology,University,https://www.mit.edu/,42.360092,-71.094162,green
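If the intent is to delete only by the key rather than by any matching field, one option is to compare a single column. The following is a minimal sketch using csv.DictReader and csv.DictWriter on in-memory text. The helper name remove_by_name and the choice of the Name column as the key are assumptions; the question does not say which column the key refers to.

```python
import csv
import io


def remove_by_name(csv_text, name, column="Name"):
    """Return csv_text without the rows whose `column` value equals `name`.

    Hypothetical helper: `column` defaults to "Name" as an assumption.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # Build a fresh list so nothing left over from earlier code is written.
    kept = [row for row in reader if row[column] != name]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(kept)
    return out.getvalue()


# Shortened sample data for illustration only.
sample = (
    "Short Name,Name,Category\n"
    "harvard,Harvard University,university\n"
    "science,Museum of Science,Tourism\n"
)

result = remove_by_name(sample, "Harvard University")
print(result)
```

Because only the Name column is compared, passing 'Tourism' here leaves the data unchanged. When working with real files instead of strings, open them with newline='' as the csv module documentation recommends, and consider writing to a separate output file while debugging.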

There's a pure-Python library, convtools, which generates code under the hood and provides lots of data processing primitives:

from convtools import conversion as c
from convtools.contrib.tables import Table

name = entername()

table = Table.from_csv("boston.csv")  # pass header=True if it's there
columns = table.columns
table.filter(
    c.not_(
        c.or_(*(c.col(column_name) == name for column_name in columns))
        if len(columns) > 1
        else c.col(columns[0]) == name
    )
).into_csv("boston_output.csv")
