pandas DataFrame median for certain columns

I'm trying to figure out how to calculate the median of the values in certain columns of a pandas DataFrame. Say, for instance, I have a DataFrame of 7 columns and 200 rows, and I want to extract the numbers contained in the columns at index 1-3 (inclusive) and calculate the median for the total of all rows combined; for 3 rows it would be the median for (x+y+z) + (x+y+z) + (x+y+z).

I've tried:

df["median"] = df.apply(lambda x : median(x), df[2:4])

but it raises the error:

TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed

I've also tried:

x = df["column1"]
y = df["column2"]
z = df["column3"]
median_nums = [x,y,z]

but the list isn't suitable, and I can't manage to extract the numbers themselves from the DataFrame in order to use statistics.median on them. The same error as above is raised.

Help would be extremely appreciated.

You can select the columns first, then take the median:

df['New']=df.iloc[:,2:4].median(axis=1)
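
For reference, here is a minimal sketch of how the selection could look for a few possible readings of the question. The DataFrame contents, the column names and the result names (row_median, overall_median, median_of_row_sums) are made up for illustration. Note that an iloc slice excludes its stop position, so 2:4 picks positions 2 and 3 only; if positions 1 through 3 inclusive are wanted, the slice would be 1:4, which is what this sketch assumes.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 200 rows, 7 columns of random integers.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.integers(0, 100, size=(200, 7)),
    columns=[f"column{i}" for i in range(7)],
)

# Columns at positions 1, 2 and 3 (the stop of an iloc slice is exclusive).
subset = df.iloc[:, 1:4]

# Reading 1: one median per row, taken across the three selected columns.
df["row_median"] = subset.median(axis=1)

# Reading 2: a single median over every value in the three columns combined.
overall_median = np.median(subset.to_numpy())

# Reading 3: the median of the per-row sums (x + y + z for each row).
median_of_row_sums = subset.sum(axis=1).median()
```

Which of the three applies depends on what "the median for the total of all rows combined" is meant to be; the answer above corresponds to the first reading.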
