Comparing rows in dataset based on a specific column to find min/max

I have a dataset that contains the history of a specific tag between a start date and an end date. I am trying to compare rows based on a date column: if rows share the same month, day, and year, I add the value from the next column to a temporary list. Once I have collected all the items for one date, I find the min and max of that list, subtract them, append the result to another list, and empty temp_list to start over.

For the sake of time and simplicity, I am presenting just an example 2D list. Here's my example data:

dataset = [[1,5],[1,6],[1,10],[1,23],[2,4],[2,8],[2,12],[3,10],[3,20],[3,40],[4,50],[4,500]]

Here the first column acts as the dates and the second column as the value.

The issues I am having are:

  1. I can't seem to compare every row based on its first column, taking the value in the second column and adding it to the temp list so I can perform the min/max operations.
  2. Based on the above 2D list I would expect to get [18,8,30,450], but the result is [5,4,10].
dataset = [[1,5],[1,6],[1,10],[1,23],[2,4],[2,8],[2,12],[3,10],[3,20],[3,40],[4,50],[4,500]]
temp_list = []
daily_total = []
for i in range(len(dataset)-1):
    if dataset[i][0] == dataset[i+1][0]:
        temp_list.append(dataset[i][1])
    else:
        max_ = max(temp_list)
        min_ = min(temp_list)
        total = max_ - min_
        daily_total.append(total)
        temp_list = []
            
print([x for x in daily_total])

Try:

tmp = {}
for d, v in dataset:
    tmp.setdefault(d, []).append(v)

out = [max(v) - min(v) for v in tmp.values()]
print(out)

Prints:

[18, 8, 30, 450]
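
As for why the loop in the question gives the result it does: it appends dataset[i][1] only when the next row has the same date, so the last value of each group (23, 12, 40) is never collected, and because the loop stops at len(dataset)-1 without a final flush, the last group is never emitted at all. Below is a minimal sketch of the same loop with both problems fixed. It assumes rows for the same date are adjacent, as in the example data; the function name daily_ranges is hypothetical and not from the question.

```python
dataset = [[1, 5], [1, 6], [1, 10], [1, 23], [2, 4], [2, 8], [2, 12],
           [3, 10], [3, 20], [3, 40], [4, 50], [4, 500]]


def daily_ranges(rows):
    """Return max - min of the values for each run of rows sharing a date."""
    temp_list = []
    daily_total = []
    for i, (date, value) in enumerate(rows):
        # Always collect the current value, including the last one of a group.
        temp_list.append(value)
        is_last_row = i == len(rows) - 1
        # Close the group when there is no next row or the next date differs.
        if is_last_row or rows[i + 1][0] != date:
            daily_total.append(max(temp_list) - min(temp_list))
            temp_list = []
    return daily_total


print(daily_ranges(dataset))  # [18, 8, 30, 450]
```

Unlike the dictionary approach above, this version depends on the rows being sorted or at least grouped by date.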

Here is a solution using pandas:

import pandas as pd

dataset = [
    [1, 5],
    [1, 6],
    [1, 10],
    [1, 23],
    [2, 4],
    [2, 8],
    [2, 12],
    [3, 10],
    [3, 20],
    [3, 40],
    [4, 50],
    [4, 500],
]

# Build the frame with named columns, then take the min and max of "value" per date.
df = pd.DataFrame(dataset, columns=["date", "value"])
df = df.groupby("date").agg(min_value=("value", "min"), max_value=("value", "max"))
df["res"] = df["max_value"] - df["min_value"]
print(df["res"].to_list())

Output:

[18, 8, 30, 450]
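
The question mentions that the real data has a date column compared by month, day, and year. If that column holds full timestamps, the same grouping idea should still apply once each timestamp is reduced to its calendar date. The following is a minimal sketch in plain Python under that assumption; the history rows and the name range_per_day are hypothetical, since the real tag history is not shown.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical timestamped readings standing in for the real tag history.
history = [
    (datetime(2023, 1, 1, 8, 0), 5),
    (datetime(2023, 1, 1, 12, 30), 6),
    (datetime(2023, 1, 1, 23, 59), 23),
    (datetime(2023, 1, 2, 0, 15), 4),
    (datetime(2023, 1, 2, 18, 0), 12),
]


def range_per_day(rows):
    """Map each calendar date to max - min of the values recorded that day."""
    by_day = defaultdict(list)
    for timestamp, value in rows:
        # .date() drops the time of day, so rows match on year, month, and day only.
        by_day[timestamp.date()].append(value)
    return {day: max(vals) - min(vals) for day, vals in by_day.items()}


result = range_per_day(history)
for day, spread in result.items():
    print(day, spread)
```

Because the values are collected in a dictionary keyed by date, the rows do not need to be sorted beforehand.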
