R - How to find the top 10% across multiple columns?

Question

For school, we are trying to find the top 10% of colleges in terms of PhD's, Grad.Rate, and Enrollment. We are using the College data frame from the ISLR package.

I have tried using head() with order() to sort them, but I am not sure whether a college needs to be within the top ten percent of all three categories.

The actual question verbatim: 'Create a dataframe that just includes the colleges that are in the top 10% in terms of PhD's, Grad.Rate and Enrollment.'

Thank you so much.

Answer 1

First, create a logical vector indicating whether a college is in the top 10% for a specific variable:

College$PhD_top10 <- College$PhD >= quantile(College$PhD, probs = 0.9) # the comparison already returns TRUE/FALSE

Repeat this for as many variables as you need.

Then subset the data frame based on those variables:

College[College$PhD_top10, ] # Join the other flag columns with & to require all of them.
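The steps in this answer can be sketched for all three variables at once. This is only an illustration under some assumptions: the data frame `colleges` below is a made-up stand-in so the snippet runs without the ISLR package, and the column name `Enroll` is assumed to be the enrollment column. The `rowSums(...) == length(vars)` test reads "top 10%" as required in every column; if the assignment means any column instead, `rowSums(in_top10) >= 1` would be the corresponding test.

```r
# Toy stand-in for ISLR::College (values are invented for illustration only).
colleges <- data.frame(
  PhD       = 60 + 1:20,
  Grad.Rate = c(70, 51:68, 69),
  Enroll    = c(seq(100, 1900, by = 100), 2500),
  row.names = paste0("College_", 1:20)
)

vars <- c("PhD", "Grad.Rate", "Enroll")

# One logical column per variable: TRUE when the row is at or above
# that variable's 90th percentile.
in_top10 <- sapply(colleges[vars], function(x) x >= quantile(x, probs = 0.9))

# Keep only the rows that are in the top 10% for every variable.
top_all <- colleges[rowSums(in_top10) == length(vars), ]
top_all
```

With the real data, replacing `colleges` with `College` after `library(ISLR)` should be the only change needed, assuming the column names match.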

Answer 2

Try using the quantile() function:

quantile(x, probs = seq(0, 1, by = 0.1)) # deciles
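As a rough sketch of how the decile output could be used to pick out the top 10%, the snippet below takes the 90% cutoff from the result and filters on it. The vector `x` holds made-up values; with the real data it would be a column such as `College$PhD`.

```r
# Made-up values for illustration only.
x <- c(12, 45, 7, 89, 33, 61, 28, 94, 50, 76)

# Deciles: the result is a named vector with names "0%", "10%", ..., "100%".
deciles <- quantile(x, probs = seq(0, 1, by = 0.1))

# The 90th percentile is the cutoff for the top 10%.
cutoff <- deciles[["90%"]]

# Values at or above the cutoff.
top <- x[x >= cutoff]
top
```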
