Is there any way to simplify the if/elif statements inside the for loop in my code below?
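Below is my code, which reads a CSV summary file for each city, counts the subscribers, customers, and other users, and computes the average trip duration per city. Is there any way to simplify the if/elif statements inside the for loop?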
import csv

new_file = {'Washington': './data/Washington-2016-Summary.csv',
            'Chicago': './data/Chicago-2016-Summary.csv',
            'NYC': './data/NYC-2016-Summary.csv'}
for city, filename in new_file.items():
    with open(filename, 'r') as fil_1:
        # Running totals of duration and trip counts per user type
        t_subscribers = 0
        t_customers = 0
        cnt_subscribers = 0
        cnt_customers = 0
        other_customers = 0
        data_reader = csv.DictReader(fil_1)
        for row in data_reader:
            if row['user_type'] == 'Subscriber':
                cnt_subscribers += 1
                t_subscribers += float(row['duration'])
            elif row['user_type'] == 'Customer':
                cnt_customers += 1
                t_customers += float(row['duration'])
            elif row['user_type'] == '':
                # Rows with no user type are counted separately,
                # but their duration is added to the customer total
                other_customers += 1
                t_customers += float(row['duration'])
        # Totals are divided by 60 to report the averages in minutes
        tripaverage_duration = ((t_subscribers + t_customers) / 60) / (cnt_subscribers + cnt_customers + other_customers)
        tripaverage_subscribers = (t_subscribers / 60) / cnt_subscribers
        tripaverage_customers = (t_customers / 60) / cnt_customers
        print('Average trip duration in', city, '-',
              tripaverage_duration, 'minutes')
        print('Average trip duration for subscribers in', city, '-',
              tripaverage_subscribers, 'minutes')
        print('Average trip duration for customers in', city, '-',
              tripaverage_customers, 'minutes')
        print('\n')
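I would suggest using a pandas DataFrame for this. You can easily subset a DataFrame based on the values in another column and then sum values, count rows, and so on. Here is an example of how it could be applied to your problem: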
import pandas as pd

new_file = {'Washington': './data/Washington-2016-Summary.csv',
            'Chicago': './data/Chicago-2016-Summary.csv',
            'NYC': './data/NYC-2016-Summary.csv'}
for city, filename in new_file.items():
    data = pd.read_csv(filename)
    # Select the 'duration' column, average it, and divide by 60 for minutes (as in the question)
    tripaverage_duration = data['duration'].mean() / 60
    # Boolean masks keep only the rows for one user type before averaging
    tripaverage_subscribers = data[data['user_type'] == 'Subscriber']['duration'].mean() / 60
    tripaverage_customers = data[data['user_type'] == 'Customer']['duration'].mean() / 60
    print('Average trip duration in', city, '-',
          tripaverage_duration, 'minutes')
    print('Average trip duration for subscribers in', city, '-',
          tripaverage_subscribers, 'minutes')
    print('Average trip duration for customers in', city, '-',
          tripaverage_customers, 'minutes')
    print('\n')
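If you want every user type at once rather than one filter per type, a groupby may be a bit shorter. The following is only a sketch: the helper name average_minutes_by_user_type, the 'Other' label for rows with no user type, and the sample values are my own assumptions, and it assumes 'duration' holds seconds, as the division by 60 in the question suggests.

```python
import pandas as pd

def average_minutes_by_user_type(data):
    """Return the mean trip duration in minutes for each user_type.

    Assumes 'duration' is in seconds (hypothetical helper, not from the question).
    """
    # Rows with a missing user type are labeled 'Other' so groupby keeps them.
    labeled = data.assign(user_type=data['user_type'].fillna('Other'))
    return labeled.groupby('user_type')['duration'].mean() / 60

# Small in-memory frame standing in for one city's CSV (made-up values).
sample = pd.DataFrame({
    'user_type': ['Subscriber', 'Subscriber', 'Customer', None],
    'duration': [600.0, 1200.0, 1800.0, 300.0],
})
averages = average_minutes_by_user_type(sample)
print(averages)
```

In the loop over cities you would pass the result of pd.read_csv(filename) instead of the sample frame.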
One option is to use comprehensions like this:
rows = list(data_reader)  # the reader can only be iterated once, so store the rows first
cnt_subscribers = sum(1 for row in rows if row['user_type'] == 'Subscriber')
t_subscribers = sum(float(row['duration']) for row in rows if row['user_type'] == 'Subscriber')
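If you would rather stay with the csv module and a single pass over the file, another possibility is to key the counters by user type, which removes the if/elif chain altogether. This is a rough sketch, not a drop-in replacement: the function name summarize, the 'Other' label for an empty user type, and the sample data are assumptions of mine. Note that, unlike the code in the question, it keeps the durations of untyped rows separate from the customer total.

```python
import csv
import io
from collections import defaultdict

def summarize(rows):
    """Return the average trip duration in minutes per user type, in one pass."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for row in rows:
        # An empty user_type is grouped under the hypothetical label 'Other'.
        user_type = row['user_type'] or 'Other'
        counts[user_type] += 1
        totals[user_type] += float(row['duration'])
    # Assumes 'duration' is in seconds, so divide by 60 for minutes.
    return {key: totals[key] / 60 / counts[key] for key in counts}

# In-memory CSV standing in for one city's summary file (made-up values).
sample_csv = io.StringIO(
    "duration,user_type\n"
    "600,Subscriber\n"
    "1200,Subscriber\n"
    "1800,Customer\n"
    "300,\n"
)
averages = summarize(csv.DictReader(sample_csv))
print(averages)
```

With a real file you would pass csv.DictReader(fil_1) inside the with block instead of the sample.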