
python csv copy column

I have a file containing the following:

first_name,last_name,uid,email,dep_code,dep_name
john,smith,jsmith,jsmith@gmail.com,finance,21230
john,king,jking,jjing@gmail.com,human resource,31230

I want to copy the "email" column into a new column, "email2", and then replace gmail.com with hotmail.com in email2.

I'm new to Python and would appreciate help from experts. I have tried a few scripts, but if there is a better way to do this, please let me know. The original file contains 60000 rows.

with open('c:\\Python27\\scripts\\colnewfile.csv', 'rb') as fp_in1, open('c:\\Python27\\scripts\\final.csv', 'wb') as fp_out1:
    writer1 = csv.writer(fp_out1, delimiter=",")
    reader1 = csv.reader(fp_in1, delimiter=",")
    domain = "@hotmail.com"
    for row in reader1:
        if row[2:3] == "uid":
            writer1.append("Email2")
        else:
            writer1.writerow(row+[row[2:3]])

Here is the final script. The only problem is that it does not write the complete output file: it shows only 61409 rows, whereas the input file has 61438 rows.

inFile = 'c:\\Python27\\scripts\\in-093013.csv'
outFile = 'c:\\Python27\\scripts\\final.csv'

with open(inFile, 'rb') as fp_in1, open(outFile, 'wb') as fp_out1:
    writer = csv.writer(fp_out1, delimiter=",")
    reader = csv.reader(fp_in1, delimiter=",")
    for col in reader:
        del col[6:]
        writer.writerow(col)
    headers = next(reader)
    writer.writerow(headers + ['email2'])
    for row in reader:
        if len(row) > 3:
            email = email.split('@', 1)[0] + '@hotmail.com'
        writer.writerow(row + [email])

If you call next() on the reader you get one row at a time; use that to copy over the headers. Copying the email column is easy enough:

import csv

infilename = r'c:\Python27\scripts\colnewfile.csv'
outfilename = r'c:\Python27\scripts\final.csv'

with open(infilename, 'rb') as fp_in, open(outfilename, 'wb') as fp_out:
    reader = csv.reader(fp_in, delimiter=",")
    headers = next(reader)  # read first row

    writer = csv.writer(fp_out, delimiter=",")
    writer.writerow(headers + ['email2'])

    for row in reader:
        if len(row) > 3:
            # make sure there are at least 4 columns
            email = row[3].split('@', 1)[0] + '@hotmail.com'
        else:
            # short row: leave email2 empty instead of reusing a stale value
            email = ''
        writer.writerow(row + [email])

This code splits the email address on the first @ sign, takes the first part of the split, and appends @hotmail.com to it:

>>> 'example@gmail.com'.split('@', 1)[0]
'example'
>>> 'example@gmail.com'.split('@', 1)[0] + '@hotmail.com'
'example@hotmail.com'
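
Note that the split shown here swaps the domain on every address, whatever it was. If only gmail.com addresses should change, a small helper can check the domain first. This is a minimal sketch; the name swap_domain and its parameters are hypothetical and not part of the script above:

```python
def swap_domain(email, old="gmail.com", new="hotmail.com"):
    """Return email with its domain replaced, but only if the domain is `old`."""
    # rpartition splits on the last '@'; sep is '' when no '@' is present
    local, sep, domain = email.rpartition("@")
    if sep and domain.lower() == old:
        return local + "@" + new
    return email


print(swap_domain("jsmith@gmail.com"))   # jsmith@hotmail.com
print(swap_domain("jking@example.org"))  # jking@example.org
```

Addresses with a different domain, or with no @ sign at all, come back unchanged.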

The above produces:

first_name,last_name,uid,email,dep_code,dep_name,email2
john,smith,jsmith,jsmith@gmail.com,finance,21230,jsmith@hotmail.com
john,king,jking,jjing@gmail.com,human resource,31230,jjing@hotmail.com

for your sample input.
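
The script above targets Python 2, where csv files are opened in binary mode ('rb' and 'wb'). On Python 3 the csv module expects text-mode files opened with newline=''. Below is a minimal sketch of the same approach under that assumption; the function name add_email2 is hypothetical, and it looks the email column up by header name rather than assuming index 3:

```python
import csv
import io


def add_email2(fp_in, fp_out, new_domain="hotmail.com"):
    """Copy CSV rows from fp_in to fp_out, appending an email2 column."""
    reader = csv.reader(fp_in)
    writer = csv.writer(fp_out)
    headers = next(reader)               # first row holds the column names
    email_idx = headers.index("email")   # raises ValueError if the column is missing
    writer.writerow(headers + ["email2"])
    for row in reader:
        if len(row) > email_idx:
            email2 = row[email_idx].split("@", 1)[0] + "@" + new_domain
        else:
            email2 = ""                  # short or blank row: leave email2 empty
        writer.writerow(row + [email2])


# Demo on in-memory data taken from the sample input
sample = (
    "first_name,last_name,uid,email,dep_code,dep_name\n"
    "john,smith,jsmith,jsmith@gmail.com,finance,21230\n"
)
out = io.StringIO()
add_email2(io.StringIO(sample), out)
print(out.getvalue())
```

With files on disk, the same function could be called as add_email2(fp_in, fp_out) inside "with open(infilename, newline='') as fp_in, open(outfilename, 'w', newline='') as fp_out:".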

This can be done very cleanly using pandas. Here it goes:

In [1]: import pandas as pd

In [3]: df = pd.read_csv('your_csv_file.csv')

In [4]: def rename_email(row):
   ...:     return row.email.replace('gmail.com', 'hotmail.com')
   ...:

In [5]: df['email2'] = df.apply(rename_email, axis=1)

In [6]: """axis = 1 or ‘columns’: apply function to each row"""

In [7]: df
Out[7]:
  first_name last_name     uid             email        dep_code  dep_name              email2
0       john     smith  jsmith  jsmith@gmail.com         finance     21230  jsmith@hotmail.com
1       john      king   jking   jjing@gmail.com  human resource     31230   jjing@hotmail.com

In [8]: df.to_csv('new_update_email_file.csv', index=False)  # index=False omits the row-number column
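
For a single column, the row-wise apply is not strictly needed. The following is a sketch of a vectorized variant using the Series string methods, run on the sample data from the question held in memory (the variable names sample and csv_text are illustrative):

```python
import io

import pandas as pd

sample = (
    "first_name,last_name,uid,email,dep_code,dep_name\n"
    "john,smith,jsmith,jsmith@gmail.com,finance,21230\n"
    "john,king,jking,jjing@gmail.com,human resource,31230\n"
)
df = pd.read_csv(io.StringIO(sample))

# regex=False treats 'gmail.com' as a literal string, so '.' is not a wildcard
df["email2"] = df["email"].str.replace("gmail.com", "hotmail.com", regex=False)

# Called without a path, to_csv returns the CSV text;
# index=False keeps the row index out of it
csv_text = df.to_csv(index=False)
print(csv_text)
```

On a file of roughly 60000 rows, this avoids calling a Python function once per row, which is what apply with axis=1 does.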
