
How to manipulate elements in a multi-dimensional list in Python

Update: as shown below, I have merged the data into a multi-dimensional list (a list of tuples):

[('Medical Services', 2), ('Medical Services', 2), ('Physical Therapy', 5), 
 ('Physical Therapy', 1), ('Chiropractic', 0)]

Is it possible to get the following result?

 The average response time for Medical Services is 2 days

(Note that duplicates should be removed.)

 The average response time for Physical Therapy is 5 days

(Note that Physical Therapy appears twice, but I want the largest number of days, which is 5.)

 The average response time for Chiropractic is less than 24 hours

(Note that a value of 0 days should be shown as "less than 24 hours".)

Here is a very simple solution that needs no external libraries, only the standard library:

from collections import defaultdict

data = [('Medical Services', 2), ('Medical Services', 2), ('Physical Therapy', 5), ('Physical Therapy', 1), ('Chiropractic', 0)]

# collect every 'days' value under its service label
label2days = defaultdict(list)
for label, days in data:
    label2days[label].append(days)

# average the values for each label and print one line per service
for label, days in label2days.items():
    average = sum(days) / len(days)
    print("The average response time for {0} is {1}".format(
        label,
        "%.2f days" % average if average >= 1 else "less than 24 hours",
    ))

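Prints (the order of the lines may differ between Python versions):

The average response time for Chiropractic is less than 24 hours
The average response time for Physical Therapy is 3.00 days
The average response time for Medical Services is 2.00 days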
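
The question asks for the largest value per service rather than the mean (5 days for Physical Therapy, not 3). A minimal sketch of that variant, assuming the same `data` list as in the question; the names `longest` and `messages` are made up for illustration:

```python
data = [('Medical Services', 2), ('Medical Services', 2), ('Physical Therapy', 5),
        ('Physical Therapy', 1), ('Chiropractic', 0)]

# Keep only the largest 'days' value seen for each service.
# Using a dict keyed by service also collapses the duplicate entries.
longest = {}
for label, days in data:
    longest[label] = max(days, longest.get(label, days))

# Build one message per service; 0 days is reported as "less than 24 hours".
messages = []
for label, days in longest.items():
    duration = "less than 24 hours" if days == 0 else "{} days".format(days)
    messages.append("The average response time for {} is {}".format(label, duration))

print("\n".join(messages))
```

Swapping `max` for another reduction (for example `min`) changes which value is kept without touching the rest of the loop.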

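[Note: this answer addresses the OP's original question, which has since been revised.]

Because of how the timedelta is handled, this may not be exactly what you want, but I believe it is close.

I suggest using Pandas for this. Assuming your CSV data is in a file named data.csv: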

import pandas as pd

# load the data into a pandas dataframe
df = pd.read_csv('data.csv', header = None, names = ['Service', 'Reported Date', 'Completion Date', 'Days'])

# convert date strings to dates
df['Reported Date'] = pd.to_datetime(df['Reported Date'])
df['Completion Date'] = pd.to_datetime(df['Completion Date'])

# calculate the timedelta and add it to the 'Days' column
df['Days'] = (df['Completion Date'] - df['Reported Date'])

# convert the timedelta to an int
df['Days'] = df['Days'].dt.days

# print our dataframe
print(df)
print()

# average the 'Days' column for each service
new_df = df.groupby('Service')[['Days']].mean()

# print our new dataframe
print(new_df)
print()

for row in new_df.itertuples():
    # row[0] is the service name (the index), row[1] is the average number of days
    if row[1] >= 1:
        print("The average response time for %s is %s days." % (row[0], row[1]))
    else:
        print("The average response time for %s is less than 24 hours." % (row[0]))

This produces the following output:

================================================================================
Oct 4, 2017, 9:00:40 AM
untitled text 11
--------------------------------------------------------------------------------
            Service       Reported Date     Completion Date  Days
0  Medical Services 2017-03-08 09:20:00 2017-03-12 14:59:00     4
1  Medical Services 2017-03-08 09:28:00 2017-03-12 14:59:00     4
2  Physical Therapy 2017-03-04 09:34:00 2017-03-14 19:21:00    10
3  Physical Therapy 2017-03-04 09:39:00 2017-03-14 15:00:00    10
4  Medical Services 2017-03-01 09:49:00 2017-03-24 19:21:00    23
5      Chiropractic 2017-03-27 10:41:00 2017-03-27 19:22:00     0
6  Medical Services 2017-03-02 10:46:00 2017-03-04 15:00:00     2
7      Chiropractic 2017-03-27 11:36:00 2017-03-27 12:51:00     0
8  Medical Services 2017-03-02 14:17:00 2017-03-02 19:22:00     0

                  Days
Service               
Chiropractic       0.0
Medical Services   6.6
Physical Therapy  10.0

The average response time for Chiropractic is less than 24 hours.
The average response time for Medical Services is 6.6 days.
The average response time for Physical Therapy is 10.0 days.

Pandas is a very powerful tool that makes this kind of operation fairly easy.
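
Note that `.dt.days` truncates each interval to whole days before averaging, so partial days are dropped. If fractional days matter, one option is to average the elapsed seconds instead. A minimal sketch, assuming the rows shown in the output above; the inline `rows` list is a stand-in for data.csv, and the names `elapsed` and `avg_days` are illustrative:

```python
import pandas as pd

# Stand-in for data.csv, copied from the dataframe printed above.
rows = [
    ("Medical Services", "2017-03-08 09:20", "2017-03-12 14:59"),
    ("Medical Services", "2017-03-08 09:28", "2017-03-12 14:59"),
    ("Physical Therapy", "2017-03-04 09:34", "2017-03-14 19:21"),
    ("Physical Therapy", "2017-03-04 09:39", "2017-03-14 15:00"),
    ("Medical Services", "2017-03-01 09:49", "2017-03-24 19:21"),
    ("Chiropractic", "2017-03-27 10:41", "2017-03-27 19:22"),
    ("Medical Services", "2017-03-02 10:46", "2017-03-04 15:00"),
    ("Chiropractic", "2017-03-27 11:36", "2017-03-27 12:51"),
    ("Medical Services", "2017-03-02 14:17", "2017-03-02 19:22"),
]
df = pd.DataFrame(rows, columns=["Service", "Reported Date", "Completion Date"])
for col in ("Reported Date", "Completion Date"):
    df[col] = pd.to_datetime(df[col])

# Elapsed time as fractional days: total seconds divided by 86400.
elapsed = df["Completion Date"] - df["Reported Date"]
df["Days"] = elapsed.dt.total_seconds() / 86400

# Mean of the fractional days for each service.
avg_days = df.groupby("Service")["Days"].mean()

for service, days in avg_days.items():
    if days >= 1:
        print("The average response time for %s is %.2f days." % (service, days))
    else:
        print("The average response time for %s is less than 24 hours." % service)
```

With fractional days, the Medical Services average comes out a little above the 6.6 shown above, because the hours and minutes of each interval are no longer discarded.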

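Update

You still need a module for the date-time calculations, but here is a version that uses only datetime from the standard library. The script is deliberately verbose so you can see what is happening; many of the steps could be combined.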

import datetime

data = []
services = []

with open('data.csv', 'r') as f:
    lines = f.readlines()

# construct list 'data' which contains the rows of data as lists
for line in lines:
    item = line.strip().split(',')
    data.append(item)

# construct list 'services' which contains a list of unique services
for item in data:
    if item[0] not in services:
        services.append(item[0])

# compute the elapsed time for each row in seconds and append it to the end of the row
for item in data:
    reported = datetime.datetime.strptime(item[1], '%m/%d/%Y %H:%M')
    delivered = datetime.datetime.strptime(item[2], '%m/%d/%Y %H:%M')
    item.append( (delivered-reported).total_seconds() )

for service in services:
    total_seconds = 0
    counter = 0

    # add up the elapsed seconds (the last item of each row) for this service
    for row in data:
        if row[0] == service:
            total_seconds += row[-1]
            counter += 1

    # average the seconds, then convert to days (86400 seconds per day)
    average_days = (total_seconds / counter) / 86400

    if average_days < 1:
        print("The average response time for %s is less than 24 hours." % (service))
    else:
        print("The average response time for %s is %.2f days." % (service, average_days))

Output:

================================================================================
Oct 4, 2017, 10:07:40 AM
untitled text 11
--------------------------------------------------------------------------------
The average response time for Medical Services is 6.85 days.
The average response time for Physical Therapy is 10.32 days.
The average response time for Chiropractic is less than 24 hours.
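
As one example of combining the steps, the script above could be condensed with `csv.reader` and `defaultdict`. This is only a sketch under assumptions: the function name `average_days_by_service` is made up, the date format is the one used in the script above, and an in-memory `io.StringIO` sample stands in for data.csv so the snippet runs on its own:

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y %H:%M"  # assumed: same format as in the script above


def average_days_by_service(csv_file):
    """Return {service: average elapsed time in days} for an open CSV file."""
    seconds_by_service = defaultdict(list)
    for row in csv.reader(csv_file):
        if not row:  # skip blank lines
            continue
        service, reported, completed = row[:3]
        elapsed = (datetime.strptime(completed, DATE_FORMAT)
                   - datetime.strptime(reported, DATE_FORMAT))
        seconds_by_service[service].append(elapsed.total_seconds())
    # Average the seconds per service, then convert to days.
    return {service: sum(secs) / len(secs) / 86400
            for service, secs in seconds_by_service.items()}


# Hypothetical sample in place of data.csv (four of the rows shown earlier).
sample = io.StringIO(
    "Physical Therapy,03/04/2017 09:34,03/14/2017 19:21\n"
    "Physical Therapy,03/04/2017 09:39,03/14/2017 15:00\n"
    "Chiropractic,03/27/2017 10:41,03/27/2017 19:22\n"
    "Chiropractic,03/27/2017 11:36,03/27/2017 12:51\n"
)
averages = average_days_by_service(sample)

for service, days in averages.items():
    if days < 1:
        print("The average response time for %s is less than 24 hours." % service)
    else:
        print("The average response time for %s is %.2f days." % (service, days))
```

To run it against a real file, pass `open('data.csv', newline='')` instead of the `io.StringIO` sample.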
