Print the count of lines written into CSV file
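[Python3] I have a script that reads a (long) CSV file containing email addresses and their corresponding country codes, and splits it by country code. This works fine, but I would like the script to print the number of lines (i.e. emails) written to each file.
Also, I am new to programming and Python, so I would be happy to receive any optimization suggestions or other general tips!
The input file looks like this: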
12345@12345.com us
xyz@xyz.com gb
aasdj@ajsdf.com fr
askdl@kjasdf.com de
sdlfj@aejf.com nl
... ...
The output should look like this:
Done!
us: 20000
gb: 20000
de: 10000
fr: 10000
nl: 10000
...
My code is as follows:
import csv, datetime
from collections import defaultdict
"""
Script splits a (long) list of email addresses with associated country codes by country codes.
Input file should have only two columns of data - ideally.
"""
# Declaring variables
emails = defaultdict(list)
in_file = "test.tsv" # Write filename here.
filename = in_file.split(".")
"""Checks if file is comma or tab separated and sets delimiter variable."""
if filename[1] == "csv":
    delimiter = ','
elif filename[1] == "tsv":
    delimiter = '\t'
"""Reads csv/tsv file and cleans email addresses."""
with open(in_file, 'r') as f:
    reader = csv.reader(f, delimiter=delimiter)
    for row in reader:
        # Gets rid of empty rows
        if row:
            # Gets rid of non-emails
            if '@' in row[0]:
                # Strips the emails from whitespace and appends to the 'emails' list
                # Also now 'cc' is in the first position [0] and email in the second [1]
                emails[row[1].strip()].append(row[0].strip()+'\n')
"""Outputs the emails by cc and names the file."""
for key, value in emails.items():
    # Key is 'cc' and value is 'email'
    # File is named by "today's date-original file's name-cc"
    with open('{0:%Y%m%d}-{1}-{2}.csv'.format(datetime.datetime.now(), filename[0], key), 'w') as f:
        f.writelines(value)
To get the output you want, you need to print the key (your country code) and the length of the value (your list of emails), like this:
"""Outputs the emails by cc and names the file."""
for key, value in emails.items():
    # Key is 'cc' and value is 'email'
    # File is named by "today's date-original file's name-cc"
    with open('{0:%Y%m%d}-{1}-{2}.csv'.format(datetime.datetime.now(), filename[0], key), 'w') as f:
        f.writelines(value)
    # The file is closed (de-indented from the with), but we're still in the for loop
    # Use the format() method of a string to print in the form `cc: number of emails`
    print('{}: {}'.format(key, len(value)))
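The same counting idea can be tried without touching any files. The following is only a minimal sketch: the function names `group_emails_by_country` and `count_summary` and the in-memory sample data are illustrative assumptions, not part of the original script. It assumes tab-separated input with the email in the first column and the country code in the second.

```python
import csv
import io
from collections import defaultdict


def group_emails_by_country(lines, delimiter="\t"):
    """Group email addresses by country code, skipping empty and non-email rows."""
    emails = defaultdict(list)
    for row in csv.reader(lines, delimiter=delimiter):
        # Require at least two columns and an '@' in the first one.
        if len(row) >= 2 and "@" in row[0]:
            emails[row[1].strip()].append(row[0].strip())
    return emails


def count_summary(emails):
    """Return one 'cc: count' string per country code."""
    return ["{}: {}".format(cc, len(addresses)) for cc, addresses in emails.items()]


# Hypothetical sample data standing in for the real input file.
sample = io.StringIO(
    "12345@12345.com\tus\n"
    "xyz@xyz.com\tgb\n"
    "\n"
    "not-an-email\tfr\n"
    "abc@abc.com\tus\n"
)

grouped = group_emails_by_country(sample)
summary = count_summary(grouped)
print("Done!")
print("\n".join(summary))
```

Because each list holds exactly the lines that would be written to that country's file, `len()` of the list is the per-file line count, so no separate counter is needed.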