
Change values of a column in a CSV file using Python

I am new to Python and need a little help.

We have a pipe-delimited CSV file that looks like this:

DATE|20160101
ID | Name | Address | City | State | Zip   | Phone | OPEID  | IPEDS |
10 | A... | 210 W.. | Mo.. | AL... | '31.. | 334.. | '01023 | 10063 |
20 | B... | 240 N.. | Ne.. | Ut... | '21.. | 335.. | '01024 | 10064 |

Every value in the Zip and OPEID columns starts with an apostrophe.

We want to create a new CSV file in which the apostrophes are removed from every value in these two columns.

The new file should then look like this:

DATE|20160101
ID | Name | Address | City | State | Zip  | Phone | OPEID | IPEDS |
10 | A... | 210 W.. | Mo.. | AL... | 31.. | 334.. | 01023 | 10063 |
20 | B... | 240 N.. | Ne.. | Ut... | 21.. | 335.. | 01024 | 10064 |

This code copies the data but does not remove the apostrophes:

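import os
import csv

file1 = r"D:\CSV\File1.csv"  # raw string, so backslashes are not treated as escapes
path = "D:/CSV/New"

# Create the output folder if it does not exist yet
if not os.path.exists(path):
    os.makedirs(path)

# Python 3: open CSV files in text mode with newline=''
with open(file1, newline='') as csvfile, \
        open(os.path.join(path, "File2.csv"), 'w', newline='') as outfile:
    reader = csv.reader(csvfile, delimiter='|')
    writer = csv.writer(outfile, delimiter='|')

    for row in reader:
        writer.writerow(row)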

The code below would be the same for all file formats. The fact that it is a *.csv doesn't change a thing. It opens the file from which you want to remove the apostrophes, my_csv_in, and reads it line by line, replacing every apostrophe with nothing (i.e. removing it). The modified lines are written to a second file, my_csv_out. Note that this removes every apostrophe in the file, not only those in the Zip and OPEID columns.

my_csv_in = r'full_file_path_to_csv_in.csv'
my_csv_out = r'full_file_path_to_csv_out.csv'

with open(my_csv_in, 'r') as f_in:
    with open(my_csv_out, 'w') as f_out:
        for line in f_in:
            f_out.write(line.replace("'", ''))

There are probably better ways to do this that take advantage of the file being a *.csv and use the csv library. You can take a look at the quoting options in the csv module documentation.
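As a rough sketch of what a csv-based version could look like, the function below only touches the Zip and OPEID columns and leaves everything else (including the DATE line) unchanged. The function name strip_leading_apostrophes and its parameters are made up for this illustration, and it assumes the header row is the first row that contains both column names once the padding spaces are stripped.

```python
import csv
import io

def strip_leading_apostrophes(src, dst, columns=("Zip", "OPEID"), delimiter="|"):
    """Copy src to dst, removing one leading apostrophe from the given columns.

    Hypothetical helper: src and dst are open text files (or file-like objects).
    """
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    targets = None  # column indexes, known once the header row has been seen
    for row in reader:
        if targets is None:
            # Rows before the header (e.g. DATE|20160101) and the header
            # itself are copied unchanged.
            names = [cell.strip() for cell in row]
            if all(col in names for col in columns):
                targets = [names.index(col) for col in columns]
        else:
            for i in targets:
                if i < len(row):
                    cell = row[i]
                    stripped = cell.lstrip()
                    if stripped.startswith("'"):
                        # Keep the padding spaces, drop only the apostrophe
                        padding = cell[:len(cell) - len(stripped)]
                        row[i] = padding + stripped[1:]
        writer.writerow(row)

# Small in-memory demo; with real files, open both with newline=''
sample = io.StringIO(
    "DATE|20160101\n"
    "ID | Zip   | OPEID  |\n"
    "10 | '31.. | '01023 |\n"
)
result = io.StringIO()
strip_leading_apostrophes(sample, result)
print(result.getvalue())
```

Because only a leading apostrophe in the two named columns is removed, apostrophes elsewhere (for example in a name or address) are left alone.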

To remove apostrophes you can use the replace function. You just need to get the content of every cell one by one and replace the apostrophes with:

new = old.replace("'", "")
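One thing to keep in mind is that replace removes every apostrophe in the cell, not just the one at the start. If that matters for your data, a small helper along these lines (the name drop_leading_apostrophe is only for illustration) removes a single leading apostrophe and keeps any padding spaces:

```python
def drop_leading_apostrophe(cell):
    """Remove one apostrophe at the start of the cell, ignoring leading spaces."""
    stripped = cell.lstrip()
    if stripped.startswith("'"):
        padding = cell[:len(cell) - len(stripped)]
        return padding + stripped[1:]
    return cell

print(repr(drop_leading_apostrophe(" '01023 ")))  # ' 01023 '
print("O'Brien".replace("'", ""))                 # OBrien
print(drop_leading_apostrophe("O'Brien"))         # O'Brien
```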

More simply, open your CSV file in any text editor, search for "'" and replace it with nothing.

It worked for me... Try this.

import csv

res = []
with open('hi.csv', newline='') as f:
    content = csv.reader(f, delimiter='|')
    for row in content:
        # Remove apostrophes from every cell in the row
        res.append([cell.replace("'", '') for cell in row])

# Overwrites hi.csv in place. Python 2: use open('hi.csv', 'wb') instead
with open('hi.csv', 'w', newline='') as ff:
    sw = csv.writer(ff, delimiter='|', quoting=csv.QUOTE_MINIMAL)
    sw.writerows(res)

You can do it very efficiently with Pandas; this will be good if your file is very large:

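import pandas as pd
import sys

with open('t.txt') as infile:
    title = next(infile)  # keep the DATE|... line so it can be written back
    infile.seek(0)
    # header=1: the second line of the file holds the column names
    table = pd.read_csv(infile, sep='|', header=1, dtype=str)

# The trailing '|' on each line produces an extra unnamed column
table.rename(columns={'Unnamed: 9': ''}, inplace=True)

# Column names keep the padding spaces from the file
table[' Zip   '] = table[' Zip   '].str.replace("'", "", regex=False)
table[' OPEID  '] = table[' OPEID  '].str.replace("'", "", regex=False)

sys.stdout.write(title)
table.to_csv(sys.stdout, sep='|', index=False)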
