Return max value from pandas DataFrame as a whole, not based on column or rows
I am trying to get the max value from a pandas DataFrame as a whole. I am not interested in what row or column it came from. I am just interested in a single max value within the DataFrame.
Here is my DataFrame:
df = pd.DataFrame({'group1': ['a','a','a','b','b','b','c','c','d','d','d','d','d'],
'group2': ['c','c','d','d','d','e','f','f','e','d','d','d','e'],
'value1': [1.1,2,3,4,5,6,7,8,9,1,2,3,4],
'value2': [7.1,8,9,10,11,12,43,12,34,5,6,2,3]})
This is what it looks like:
group1 group2 value1 value2
0 a c 1.1 7.1
1 a c 2.0 8.0
2 a d 3.0 9.0
3 b d 4.0 10.0
4 b d 5.0 11.0
5 b e 6.0 12.0
6 c f 7.0 43.0
7 c f 8.0 12.0
8 d e 9.0 34.0
9 d d 1.0 5.0
10 d d 2.0 6.0
11 d d 3.0 2.0
12 d e 4.0 3.0
Expected output:
43.0
I was under the assumption that df.max() would do this job, but it returns a max value for each column, which is not what I want. I need the max from the entire DataFrame.
The max of all the values in the DataFrame can be obtained using df.to_numpy().max(), or for pandas < 0.24.0 we use df.values.max():
In [10]: df.to_numpy().max()
Out[10]: 'f'
The max is f rather than 43.0 since, in CPython2,
In [11]: 'f' > 43.0
Out[11]: True
In CPython2, objects of different types ... are ordered by their type names. So any str compares as greater than any int since 'str' > 'int'.
In Python3, comparison of strings and ints raises a TypeError.
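As a rough illustration of the Python 3 behaviour described above, the sketch below uses a small made-up two-column frame (not the one from the question) and catches the error instead of letting it propagate. The flag names are only there for the demonstration.

```python
import pandas as pd

# Small illustrative frame mixing a string column and a float column.
df = pd.DataFrame({'group1': ['a', 'f'], 'value1': [1.1, 43.0]})

# In Python 3, ordering a str against a float is not defined.
try:
    'f' > 43.0
    comparison_failed = False
except TypeError as exc:
    comparison_failed = True
    print('comparison failed:', exc)

# The mixed frame becomes an object array, so reducing it hits the same error.
try:
    df.to_numpy().max()
    mixed_max_failed = False
except TypeError as exc:
    mixed_max_failed = True
    print('max over mixed columns failed:', exc)
```

So on Python 3 the string columns have to be excluded before taking the overall max.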
To find the max value in the numeric columns only, use
df.select_dtypes(include=[np.number]).max()
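Note that the expression above still returns one max per numeric column. A minimal sketch of reducing that to the single value the question asks for, using the DataFrame from the question (the variable names are just for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'group1': ['a','a','a','b','b','b','c','c','d','d','d','d','d'],
                   'group2': ['c','c','d','d','d','e','f','f','e','d','d','d','e'],
                   'value1': [1.1,2,3,4,5,6,7,8,9,1,2,3,4],
                   'value2': [7.1,8,9,10,11,12,43,12,34,5,6,2,3]})

# Keep only the numeric columns (value1 and value2).
numeric = df.select_dtypes(include=[np.number])

# Per-column maxima: a Series indexed by column name.
per_column_max = numeric.max()

# Single overall maximum: reduce the underlying 2-D array.
overall_max = numeric.to_numpy().max()
print(overall_max)  # 43.0
```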
Hi, the simplest answer is the following. Answer:
df.max().max()
Explanation:
series = df.max() gives you a Series containing the maximum values for each column.
Therefore series.max() gives you the maximum for the whole DataFrame.
:) best answers are usually the simplest
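One caveat: with the DataFrame from the question, df.max() also includes the string columns, so the second max() would end up comparing strings with floats. A possible variant, assuming only the numeric columns matter, passes numeric_only=True to the first step:

```python
import pandas as pd

df = pd.DataFrame({'group1': ['a','a','a','b','b','b','c','c','d','d','d','d','d'],
                   'group2': ['c','c','d','d','d','e','f','f','e','d','d','d','e'],
                   'value1': [1.1,2,3,4,5,6,7,8,9,1,2,3,4],
                   'value2': [7.1,8,9,10,11,12,43,12,34,5,6,2,3]})

# Step 1: max of each numeric column; group1 and group2 are skipped.
column_max = df.max(numeric_only=True)

# Step 2: max of those per-column maxima.
overall_max = column_max.max()
print(overall_max)  # 43.0
```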
An alternative way:
df.melt().value.max()
Essentially, melt() transforms the DataFrame into one long column.
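A small sketch of what that looks like on the question's data. It assumes the melt is restricted to the two numeric columns, since melting the string columns as well would mix strings and floats in the same "value" column:

```python
import pandas as pd

df = pd.DataFrame({'group1': ['a','a','a','b','b','b','c','c','d','d','d','d','d'],
                   'group2': ['c','c','d','d','d','e','f','f','e','d','d','d','e'],
                   'value1': [1.1,2,3,4,5,6,7,8,9,1,2,3,4],
                   'value2': [7.1,8,9,10,11,12,43,12,34,5,6,2,3]})

# melt() stacks the selected columns into two columns:
# 'variable' (the original column name) and 'value' (the cell content).
long_df = df[['value1', 'value2']].melt()
print(long_df.shape)  # (26, 2): 13 rows x 2 columns stacked

overall_max = long_df['value'].max()
print(overall_max)  # 43.0
```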
Max can be found in these two steps:
maxForRow = allData.max(axis=1)  # max for each row (axis=0 would give the max of each column)
globalMax = maxForRow.max()  # max across all rows
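To make the axis argument concrete, here is a quick check on a tiny made-up numeric frame (allData here is only a stand-in, not the original poster's data). Reducing per column or per row first gives the same overall max:

```python
import pandas as pd

# Hypothetical all-numeric stand-in for allData.
allData = pd.DataFrame({'value1': [1.1, 9.0, 4.0],
                        'value2': [7.1, 43.0, 3.0]})

# axis=0 collapses the rows: one max per column.
max_per_column = allData.max(axis=0)

# axis=1 collapses the columns: one max per row.
max_per_row = allData.max(axis=1)

print(max_per_column.tolist())  # [9.0, 43.0]
print(max_per_row.tolist())     # [7.1, 43.0, 4.0]

# Either intermediate result leads to the same global max.
globalMax = max_per_row.max()
print(globalMax)  # 43.0
```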
For the max, check the previous answer... For the max of the values use e.g.:
val_cols = [c for c in df.columns if c.startswith('val')]
print(df[val_cols].max())
Using numpy max:
np.max(df.values)
or
np.nanmax(df.values)
or in pandas
df.values.max()
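These options differ once the data contains missing values. A brief sketch on a small made-up numeric frame with NaN entries (not the frame from the question):

```python
import numpy as np
import pandas as pd

# Illustrative numeric frame with missing values.
df = pd.DataFrame({'value1': [1.1, np.nan, 9.0],
                   'value2': [7.1, 43.0, np.nan]})

plain_max = np.max(df.values)        # NaN propagates through np.max
nan_aware_max = np.nanmax(df.values) # NaN entries are ignored
pandas_max = df.max().max()          # pandas skips NaN by default

print(plain_max)      # nan
print(nan_aware_max)  # 43.0
print(pandas_max)     # 43.0
```

So np.nanmax, or the pandas reductions, are probably the safer choice when NaN may be present.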