Deleting Rows in a .csv File (Python)

Good evening. I'm having a problem with some code I'm writing, and I would appreciate advice. I want to do the following:

  1. Remove rows in a .csv file that contain a specific value (-3.4028*10^38)
  2. Write a new .csv

The file I'm working with is large (12.2 GB, 87 million rows) and has 6 columns: the first 5 contain numerical values and the last contains text.

Here is my code:

import csv

directory = "/media/gman/Folder1/processed/test_removal1.csv"
with open('run1.csv', 'r') as fin, open(directory, 'w', newline='') as fout:

    # define reader and writer objects
    reader = csv.reader(fin, skipinitialspace=False)
    writer = csv.writer(fout, delimiter=',')

    # write headers
    writer.writerow(next(reader))

    # iterate and write rows based on condition
    for i in reader:
        if (i[-1]) == -3.4028E38:
            writer.writerow(i)

When I run this, I get the following error message:

Error: line contains NUL

File "/media/gman/Aerospace_Classes/Programs/csv_remove.py", line 19, in <module>
for i in reader: Error: line contains NUL 

I'm not sure how to proceed. If anyone has any suggestions, please let me know. Thank you.
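For readers who want to stay with the csv module, here is a minimal sketch of one way around the error. It assumes the NUL characters are stray bytes that can simply be discarded; NUL bytes can also mean the file is UTF-16 encoded or corrupted, in which case this is not the right fix. It also addresses three issues in the loop above: csv.reader yields strings, so comparing a field to a float is never true; i[-1] is the text column rather than a numeric one; and the condition writes matching rows instead of skipping them. The names strip_nuls, is_sentinel and filter_rows, and the -1e38 cutoff, are illustrative assumptions and not part of the original post.

```python
import csv


def strip_nuls(lines):
    """Yield each line with NUL characters removed."""
    for line in lines:
        yield line.replace("\0", "")


def is_sentinel(field, threshold=-1e38):
    """Return True if the field parses as a float at or below the threshold.

    The threshold is an assumed cutoff that catches -3.4028E38 without
    relying on an exact floating-point match.
    """
    try:
        return float(field) <= threshold
    except ValueError:
        return False  # non-numeric text is never a sentinel


def filter_rows(fin, fout, numeric_cols=5):
    """Copy CSV rows from fin to fout, skipping rows with a sentinel value.

    Only the first numeric_cols fields of each row are checked.
    Returns a (kept, dropped) tuple of row counts, excluding the header.
    """
    reader = csv.reader(strip_nuls(fin))
    writer = csv.writer(fout)
    writer.writerow(next(reader))  # copy the header row unchanged
    kept = dropped = 0
    for row in reader:
        if any(is_sentinel(field) for field in row[:numeric_cols]):
            dropped += 1
        else:
            writer.writerow(row)
            kept += 1
    return kept, dropped
```

To run it on real files, open both with newline='' as in the question and pass the two file objects to filter_rows. Rows are processed one at a time, so memory use should stay small even for a 12.2 GB input.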

I figured it out. Here is what I ended up doing:

# IMPORT LIBRARIES
import pandas as pd

# IMPORT FILE PATH
directory = '/media/gman/Grant/Maps/processed_maps/csv_combined.csv'

# CREATE DATAFRAME FROM IMPORTED CSV
data = pd.read_csv(directory)
data.head()  # preview the first rows (only displayed in an interactive session)

# Remove rows whose altitude (third column) is less than -100,000 meters.
# This removes the -3.402823E038 meter altitude values that keep coming up.
data.drop(data[data.iloc[:, 2] < -100000].index, inplace=True)

# CONVERT PROCESSED DATAFRAME INTO NEW CSV FILE
# to_csv returns None when given a path, so its result is not assigned.
data.to_csv(r'/media/gman/Grant/Maps/processed_maps/corrected_altitude_data.csv')  # export good data to this file

I used pandas to load the file into a DataFrame and removed rows based on a logical condition. I then exported the DataFrame to a CSV file.
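Since the file in the question is 12.2 GB, loading it all at once may not fit in memory on every machine. The following is a rough sketch of the same filter applied in chunks, using the chunksize parameter of pd.read_csv. The function name drop_bad_altitudes, the default chunk size, and the choice to omit the index column from the output are assumptions made for illustration; the column position and the -100000 threshold come from the code above.

```python
import pandas as pd


def drop_bad_altitudes(src, dst, column=2, threshold=-100000, chunksize=1_000_000):
    """Stream src in chunks and write rows that pass the filter to dst.

    Rows whose value in the given column position is below the threshold
    are dropped. Returns the number of rows written.
    """
    kept = 0
    first = True
    for chunk in pd.read_csv(src, chunksize=chunksize):
        # Same condition as the answer: drop rows where the value < threshold.
        good = chunk[~(chunk.iloc[:, column] < threshold)]
        # Overwrite on the first chunk and write the header once, then append.
        good.to_csv(dst, mode="w" if first else "a", header=first, index=False)
        kept += len(good)
        first = False
    return kept
```

Calling drop_bad_altitudes with the input and output paths from the answer should produce the same rows as the single-pass version, apart from the omitted index column, while holding only one chunk in memory at a time.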
