R: grouping numbers into bins

Question

I am looking for the smallest number in a data frame column that is larger than a given number from another array.

Example
DistrDF

Bin Freq CumSum  
0.1 0.05 0.05  
0.2 0.07 0.12    
0.3 0.20 0.32  
0.4 0.10 0.42  
0.5 0.00 0.42   
0.6 0.15 0.57  
0.7 0.00 0.57  
0.8 0.30 0.87  
0.9 0.11 0.98  
1.0 0.02 1.0

Then I have an array of, say, 10 random numbers between 0 and 1 (i.e. each random number will fall into one of the bins in DistrDF):

RandNums
0.13
0.50
0.11
0.10
0.70
0.05
0.12
0.80
0.88
0.40

I would like to use these two tables to create a third table that indicates which bin each of the random numbers falls into, as below:

ResultDF  
0.30 (because 0.13 < 0.32 and 0.13 > 0.12)
0.60 (because 0.50 < 0.57 and 0.50 > 0.42)
...
0.40 (because 0.40 < 0.42 and 0.40 > 0.32)

Does anyone have any ideas? I feel like an aggregate or something might be in order, but I'm not sure.

Answer 1

The cut function does what you want:

DistrDF <- DistrDF[DistrDF$Freq > 0,]  # Remove empty bins
DistrDF$Bin[cut(RandNums, c(0, DistrDF$CumSum))]  # RandNums is the numeric vector from the question
# [1] 0.3 0.6 0.2 0.2 0.8 0.1 0.2 0.8 0.9 0.4
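
For anyone who wants to run this from scratch, here is a self-contained sketch that rebuilds the example data from the question. DistrDF and RandNums are the names used in the question; NonEmpty, Idx and Result are illustrative names only, and RandNums is assumed to be a plain numeric vector rather than a column of another data frame.

```r
# Rebuild the example data from the question.
DistrDF <- data.frame(
  Bin    = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0),
  Freq   = c(0.05, 0.07, 0.20, 0.10, 0.00, 0.15, 0.00, 0.30, 0.11, 0.02),
  CumSum = c(0.05, 0.12, 0.32, 0.42, 0.42, 0.57, 0.57, 0.87, 0.98, 1.00)
)
RandNums <- c(0.13, 0.50, 0.11, 0.10, 0.70, 0.05, 0.12, 0.80, 0.88, 0.40)

# Drop empty bins so the break points are unique
# (cut() stops with an error if breaks are duplicated).
NonEmpty <- DistrDF[DistrDF$Freq > 0, ]

# cut() returns a factor; its integer codes are the interval indices.
# as.integer() makes the indexing step explicit.
Idx <- as.integer(cut(RandNums, breaks = c(0, NonEmpty$CumSum)))
Result <- NonEmpty$Bin[Idx]
print(Result)
```

Keeping the filtered rows in a separate object (NonEmpty) rather than overwriting DistrDF is just a choice made here so the original table stays available.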

You can manipulate the include.lowest and right parameters to change how you handle points that fall on the border of bins.
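
As a rough illustration of what those two parameters do, the sketch below bins three values that sit exactly on a border. Breaks and x are made-up values chosen for the demonstration (the first few cumulative sums from the question), not part of the original answer.

```r
# Illustrative break points and values that sit exactly on them.
Breaks <- c(0, 0.05, 0.12, 0.32)
x <- c(0, 0.05, 0.12)

# Default (right = TRUE): intervals are (a, b], so 0 falls outside every bin.
Default <- as.integer(cut(x, Breaks))
print(Default)      # NA 1 2

# include.lowest = TRUE closes the first interval, giving [0, 0.05].
Closed <- as.integer(cut(x, Breaks, include.lowest = TRUE))
print(Closed)       # 1 1 2

# right = FALSE switches to [a, b), so a border value moves up one bin.
LeftClosed <- as.integer(cut(x, Breaks, right = FALSE))
print(LeftClosed)   # 1 2 3
```

For the question as posed, a random number equal to a cumulative sum belongs to the lower bin under the default settings, and a random number of exactly 0 would come back as NA unless include.lowest = TRUE is set.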

solution1
1 ACCEPTED 2015-07-30 16:48:20
