python csv copy column
I have a file containing the following:
first_name,last_name,uid,email,dep_code,dep_name
john,smith,jsmith,jsmith@gmail.com,finance,21230
john,king,jking,jjing@gmail.com,human resource,31230
I want to copy the "email" column into a new column "email2", and then replace gmail.com with hotmail.com in email2.
I'm new to Python, so I need help from experts. I tried a few scripts, but if there is a better way to do it, please let me know. The original file contains 60000 rows.
with open('c:\\Python27\\scripts\\colnewfile.csv', 'rb') as fp_in1, open('c:\\Python27\\scripts\\final.csv', 'wb') as fp_out1:
    writer1 = csv.writer(fp_out1, delimiter=",")
    reader1 = csv.reader(fp_in1, delimiter=",")
    domain = "@hotmail.com"
    for row in reader1:
        if row[2:3] == "uid":
            writer1.append("Email2")
        else:
            writer1.writerow(row+[row[2:3]])
Here is the final script. The only problem is that it does not write the complete output file: it shows only 61409 rows, whereas the input file has 61438 rows.
inFile = 'c:\\Python27\\scripts\\in-093013.csv'
outFile = 'c:\\Python27\\scripts\\final.csv'
with open(inFile, 'rb') as fp_in1, open(outFile, 'wb') as fp_out1:
    writer = csv.writer(fp_out1, delimiter=",")
    reader = csv.reader(fp_in1, delimiter=",")
    for col in reader:
        del col[6:]
        writer.writerow(col)
    headers = next(reader)
    writer.writerow(headers + ['email2'])
    for row in reader:
        if len(row) > 3:
            email = email.split('@', 1)[0] + '@hotmail.com'
            writer.writerow(row + [email])
If you call next() on the reader you get one row at a time; use that to copy over the headers. Copying the email column is easy enough:
import csv
infilename = r'c:\Python27\scripts\colnewfile.csv'
outfilename = r'c:\Python27\scripts\final.csv'
with open(infilename, 'rb') as fp_in, open(outfilename, 'wb') as fp_out:
    reader = csv.reader(fp_in, delimiter=",")
    headers = next(reader)  # read first row
    writer = csv.writer(fp_out, delimiter=",")
    writer.writerow(headers + ['email2'])
    for row in reader:
        if len(row) > 3:
            # make sure there are at least 4 columns
            email = row[3].split('@', 1)[0] + '@hotmail.com'
            writer.writerow(row + [email])
This code splits the email address on the first @ sign, takes the first part of the split and appends @hotmail.com to it:
>>> 'example@gmail.com'.split('@', 1)[0]
'example'
>>> 'example@gmail.com'.split('@', 1)[0] + '@hotmail.com'
'example@hotmail.com'
The above produces:
first_name,last_name,uid,email,dep_code,dep_name,email2
john,smith,jsmith,jsmith@gmail.com,finance,21230,jsmith@hotmail.com
john,king,jking,jjing@gmail.com,human resource,31230,jjing@hotmail.com
for your sample input.
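For reference, here is a rough Python 3 sketch of the same idea. It is not the script above, just one possible variant under a few assumptions: the function name add_email2 is made up for illustration, the email column is located by its header name rather than by position, only addresses whose domain is exactly gmail.com are rewritten, and rows that are too short are copied through and counted instead of being skipped, which may help when the output row count does not match the input.

```python
import csv
import io


def add_email2(src, dst, old_domain="gmail.com", new_domain="hotmail.com"):
    """Copy CSV rows from src to dst, appending an email2 column.

    add_email2 is a hypothetical helper. It returns (full, short):
    the number of rows that got an email2 value and the number of rows
    that were too short to contain an email column.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst)
    headers = next(reader)               # first row holds the column names
    email_idx = headers.index("email")   # assumes a column named "email"
    writer.writerow(headers + ["email2"])
    full = short = 0
    for row in reader:
        if len(row) <= email_idx:
            # Too short to hold an email: copy it unchanged and count it.
            short += 1
            writer.writerow(row)
            continue
        local, sep, domain = row[email_idx].rpartition("@")
        if domain == old_domain:
            email2 = local + sep + new_domain
        else:
            email2 = row[email_idx]      # leave other domains untouched
        writer.writerow(row + [email2])
        full += 1
    return full, short


# In-memory demo using the sample rows from the question.
sample = (
    "first_name,last_name,uid,email,dep_code,dep_name\n"
    "john,smith,jsmith,jsmith@gmail.com,finance,21230\n"
    "john,king,jking,jjing@gmail.com,human resource,31230\n"
)
out = io.StringIO()
counts = add_email2(io.StringIO(sample), out)
print(out.getvalue())
print(counts)
```

With real files in Python 3, the csv module documentation recommends opening both files in text mode with newline='', for example open(infilename, newline='') and open(outfilename, 'w', newline=''), instead of the 'rb' and 'wb' modes used with Python 2.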
This can be done very cleanly using pandas. Here it goes:
In [1]: import pandas as pd
In [3]: df = pd.read_csv('your_csv_file.csv')
In [4]: def rename_email(row):
   ...:     return row.email.replace('gmail.com', 'hotmail.com')
   ...:
In [5]: df['email2'] = df.apply(rename_email, axis=1)
In [6]: """axis = 1 or ‘columns’: apply function to each row"""
In [7]: df
Out[7]:
   first_name  last_name     uid             email        dep_code  dep_name              email2
0        john      smith  jsmith  jsmith@gmail.com         finance     21230  jsmith@hotmail.com
1        john       king   jking   jjing@gmail.com  human resource     31230   jjing@hotmail.com
In [8]: df.to_csv('new_update_email_file.csv')
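As a possible alternative to the row-wise apply, the same column can be built with the vectorised string methods on the email column. The sketch below is only an illustration: it reads the sample rows from an in-memory string rather than a file, and it passes index=False to to_csv so the DataFrame index is not written as an extra first column (with the default, to_csv includes the index).

```python
import io

import pandas as pd

# Sample rows from the question, held in memory for the demo.
sample = io.StringIO(
    "first_name,last_name,uid,email,dep_code,dep_name\n"
    "john,smith,jsmith,jsmith@gmail.com,finance,21230\n"
    "john,king,jking,jjing@gmail.com,human resource,31230\n"
)
df = pd.read_csv(sample)

# Literal (non-regex) replacement applied to the whole column at once.
df["email2"] = df["email"].str.replace("@gmail.com", "@hotmail.com", regex=False)

# With no path argument, to_csv returns the CSV text as a string.
csv_text = df.to_csv(index=False)
print(csv_text)
```

To write a file instead, pass a path as the first argument, as in df.to_csv('new_update_email_file.csv', index=False).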