Produce new column in data frame by assigning values based on quantiles in R?
Let's make a dummy vector called INCOME: INCOME <- rnorm(1:1000, 500, 100)
Then let's take quantiles using the quantile function: INCOME_QUANTILES <- quantile(INCOME, probs=c(0.05, 0.50, 1.00))
Now I want to make a new vector called INCOME QUANTILE and attach it to my vector INCOME to create a data frame with 2 columns (INCOME / INCOME QUANTILE) and 1000 observations. The new vector should hold a value of 1, 2, or 3, depending on which income quantile the observation falls into: 1 = 0.05 quantile, 2 = 0.50 quantile, and 3 = 1.00 quantile.
So for example, if the first observation of income falls into the 1.00 quantile, and the second observation falls into the 0.50 quantile, it'll look like:
INCOME   INCOME QUANTILE
550.50   3
415.20   2
A friend suggested writing a for loop, but I'm honestly not sure how to go about that. Any help would be much appreciated!
You can try this:
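# 1000 draws with mean 500 and standard deviation 100
INCOME <- rnorm(1000, 500, 100)
# 0 is included so the breaks start at the minimum of INCOME
INCOME_QUANTILES <- quantile(INCOME, probs = c(0, 0.05, 0.50, 1.00))
# cut() bins each value; as.numeric() turns the factor into codes 1, 2, 3
df <- data.frame(INCOME,
                 INCOME_GROUP = as.numeric(cut(INCOME, breaks = INCOME_QUANTILES, include.lowest = TRUE)))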
Note that I had to add 0 as the lowest quantile so that the breaks cover the whole range of INCOME. The groups are then: 0 to 0.05 = 1, above 0.05 to 0.50 = 2, above 0.50 = 3.
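To check that the grouping behaves as described, something like the following sketch may help. It is not part of the original answer: the seed and the names group_cut and group_fi are assumptions made for illustration. It uses labels = FALSE so that cut() returns integer codes directly, and compares the result with findInterval(), which should give the same codes when its intervals are set up to be open on the left like those of cut().

```r
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible

INCOME <- rnorm(1000, mean = 500, sd = 100)
INCOME_QUANTILES <- quantile(INCOME, probs = c(0, 0.05, 0.50, 1.00))

# labels = FALSE makes cut() return integer bin codes instead of a factor
group_cut <- cut(INCOME, breaks = INCOME_QUANTILES,
                 include.lowest = TRUE, labels = FALSE)

# Alternative: findInterval() with (lo, hi] intervals; with left.open = TRUE,
# rightmost.closed = TRUE closes the lowest interval so the minimum lands in group 1
group_fi <- findInterval(INCOME, INCOME_QUANTILES,
                         left.open = TRUE, rightmost.closed = TRUE)

df <- data.frame(INCOME, INCOME_GROUP = group_cut)

# Group sizes: roughly 5%, 45% and 50% of the observations
print(table(df$INCOME_GROUP))

# Both approaches should agree on every observation
print(all(group_cut == group_fi))
```

Neither version needs a for loop, because cut() and findInterval() are both vectorised over INCOME.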