Compare rows in a dataset based on a specific column to find min/max values

Question

I have a dataset that contains the history of a specific tag from a start date to an end date. I am trying to compare rows based on a date column: if rows match by month, day, and year, I add the value from the next column to a temporary list. Once I have all the items for one date, I take that list, find the min and max values, subtract them, append the result to another list, and empty temp_list to start over.

For the sake of time and simplicity, I am presenting just an example as a 2D list. Here's my example data:

dataset = [[1,5],[1,6],[1,10],[1,23],[2,4],[2,8],[2,12],[3,10],[3,20],[3,40],[4,50],[4,500]]

The first column acts as the date and the second as the value.

The issues I am having are:

I can't seem to compare every row based on its first column so that the value in the second column goes into the temp list for the min/max operations.
Based on the above 2D list I would expect to get [18,8,30,450], but the result is [5,4,10].

dataset = [[1,5],[1,6],[1,10],[1,23],[2,4],[2,8],[2,12],[3,10],[3,20],[3,40],[4,50],[4,500]]
temp_list = []
daily_total = []
for i in range(len(dataset)-1):
    if dataset[i][0] == dataset[i+1][0]:
        temp_list.append(dataset[i][1])
    else:
        max_ = max(temp_list)
        min_ = min(temp_list)
        total = max_ - min_
        daily_total.append(total)
        temp_list = []
            
print([x for x in daily_total])

Answer 1

Try:

tmp = {}
for d, v in dataset:
    tmp.setdefault(d, []).append(v)

out = [max(v) - min(v) for v in tmp.values()]
print(out)

Prints:

[18, 8, 30, 450]
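
For reference, the loop in the question gives [5,4,10] for two reasons. It appends dataset[i][1] only when the next row has the same date, so the last value of every group is dropped. It also never reaches the else branch for the final group, so that group is never written out. A minimal sketch of the same loop with both problems addressed, keeping the question's variable names (is_last_row is a name introduced here for illustration):

```python
dataset = [[1,5],[1,6],[1,10],[1,23],[2,4],[2,8],[2,12],[3,10],[3,20],[3,40],[4,50],[4,500]]

temp_list = []
daily_total = []
for i, (date, value) in enumerate(dataset):
    # Always collect the current row's value, including the last row of a group.
    temp_list.append(value)
    # A group ends when the next row has a different date or there is no next row.
    is_last_row = i == len(dataset) - 1
    if is_last_row or dataset[i + 1][0] != date:
        daily_total.append(max(temp_list) - min(temp_list))
        temp_list = []

print(daily_total)  # [18, 8, 30, 450]
```

Like the original loop, this assumes rows with the same date are adjacent. The dictionary approach above does not need that assumption.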

Answer 2

Here is a solution using pandas:

import pandas as pd

dataset = [
    [1, 5],
    [1, 6],
    [1, 10],
    [1, 23],
    [2, 4],
    [2, 8],
    [2, 12],
    [3, 10],
    [3, 20],
    [3, 40],
    [4, 50],
    [4, 500],
]

df = pd.DataFrame(dataset)
df.columns = ["date", "value"]
df = df.groupby("date").agg(min_value=("value", "min"), max_value=("value", "max"))
df["res"] = df["max_value"] - df["min_value"]
print(df["res"].to_list())

Output:

[18, 8, 30, 450]
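
The question says the real data has a date column whose rows should be grouped when month, day, and year match. A rough sketch of that case using only the standard library, assuming the column holds datetime objects; the sample rows and the names rows, values_by_day, and daily_range are made up for illustration:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (timestamp, value)
rows = [
    (datetime(2021, 7, 20, 8, 0), 5),
    (datetime(2021, 7, 20, 12, 30), 23),
    (datetime(2021, 7, 20, 18, 45), 10),
    (datetime(2021, 7, 21, 9, 15), 4),
    (datetime(2021, 7, 21, 21, 0), 12),
]

# ts.date() drops the time of day, so rows sharing year, month, and day
# end up under the same key.
values_by_day = defaultdict(list)
for ts, value in rows:
    values_by_day[ts.date()].append(value)

# Max minus min for each calendar day.
daily_range = {day: max(vals) - min(vals) for day, vals in values_by_day.items()}
print(daily_range)
```

If the dates are stored as strings, they would need to be parsed first, for example with datetime.strptime and a format matching the data.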

Answer 1: score 0, accepted, 2021-07-22 17:25:37

Answer 2: score 0, 2021-07-22 17:41:41