
Pandas - min of a column for each value in other

I have a CSV file as follows:

Date, Name
2015-01-01 16:30:00.0, John
2015-02-11 16:30:00.0, Doe
2015-03-01 16:30:00.0, Sam
2015-03-05 16:30:00.0, Sam
2015-04-21 16:30:00.0, Chris
2015-05-07 16:30:00.0, John
2015-06-08 16:30:00.0, Doe

As you can see, the same name appears on multiple dates. For each unique name, I want to know the MAX date in the Date column. How can I do this with Pandas, or with any other Python solution?

I want a result like this:

Name, Max date(or latest)
John, 2015-01-01 16:30:00.0
Doe, 2015-01-01 16:30:00.0
Sam, 2015-01-01 16:30:00.0
Chris, 2015-01-01 16:30:00.0

Use DataFrame.groupby() and then call .max() or .min() on the result, depending on which one you want. Example:

df.groupby('Name').max()
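As a small self-contained sketch of the call above, the frame below is built by hand from the question's rows rather than read from a file. The variable names (`latest`, `extremes`) are illustrative and not part of the original answer, and `agg` is used only to show the earliest and latest date side by side.

```python
import pandas as pd

# Rows copied from the question; the dates are converted to datetime
# so that they compare chronologically rather than as strings.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2015-01-01 16:30:00", "2015-02-11 16:30:00", "2015-03-01 16:30:00",
        "2015-03-05 16:30:00", "2015-04-21 16:30:00", "2015-05-07 16:30:00",
        "2015-06-08 16:30:00",
    ]),
    "Name": ["John", "Doe", "Sam", "Sam", "Chris", "John", "Doe"],
})

# Latest date per name (use .min() instead for the earliest).
latest = df.groupby("Name")["Date"].max()

# Both extremes at once; 'min' and 'max' become the column names.
extremes = df.groupby("Name")["Date"].agg(["min", "max"])
print(extremes)
```

Selecting `["Date"]` before aggregating keeps the result to that one column, which matters once the frame has other columns that should not be aggregated.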

You also need to make sure that the 'Date' column is parsed as datetime when you read in the CSV, by passing the parse_dates argument to pd.read_csv() (as in the example below). The dtype argument does not parse dates.


Example/Demo (for the CSV example in your question):

In [12]: df = pd.read_csv('a.csv', parse_dates=['Date'])

In [13]: df
Out[13]:
                 Date   Name
0 2015-01-01 16:30:00   John
1 2015-02-11 16:30:00    Doe
2 2015-03-01 16:30:00    Sam
3 2015-03-05 16:30:00    Sam
4 2015-04-21 16:30:00  Chris
5 2015-05-07 16:30:00   John
6 2015-06-08 16:30:00    Doe

In [15]: df.groupby(['Name']).max()
Out[15]:
                     Date
Name
Chris 2015-04-21 16:30:00
Doe   2015-06-08 16:30:00
John  2015-05-07 16:30:00
Sam   2015-03-05 16:30:00
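For anyone who wants to run the demo without creating a file, here is a hedged sketch that reads the question's CSV text from memory. It rests on a few assumptions that are not in the original answer: `skipinitialspace=True` is added because the sample has a space after each comma, `parse_dates` is the `read_csv` option used here for datetime parsing, and the `Max date` column name and `sort=False` are chosen only to mirror the layout the question asks for.

```python
import io

import pandas as pd

# The question's CSV, held in memory here instead of a file on disk.
csv_text = """Date, Name
2015-01-01 16:30:00.0, John
2015-02-11 16:30:00.0, Doe
2015-03-01 16:30:00.0, Sam
2015-03-05 16:30:00.0, Sam
2015-04-21 16:30:00.0, Chris
2015-05-07 16:30:00.0, John
2015-06-08 16:30:00.0, Doe
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    skipinitialspace=True,  # drop the space after each comma, so the header is 'Name'
    parse_dates=["Date"],   # parse 'Date' as datetime64 instead of plain strings
)

result = (
    df.groupby("Name", sort=False)["Date"]  # sort=False keeps first-appearance order
    .max()                                  # latest date per name
    .reset_index()                          # turn the 'Name' index back into a column
    .rename(columns={"Date": "Max date"})   # hypothetical header, after the question
)
print(result.to_csv(index=False))
```

Without `skipinitialspace=True` the second column would be named `' Name'` (with a leading space), and `groupby('Name')` would raise a `KeyError` on this particular input.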


