简体   繁体   中英

create frequency tables using rollapply() in R

I have a data set like this:

Just 1 table with 2 columns. The first column runs from 1 to 100 and in the second we have random numbers. for eg

x    y
1    25
2    51
3    0   
-    --
48   250
49   500
50   1000
-    ---       --and so on till
100  600

Now , I need to choose a window of first 50 rows (x = 1 to x= 50).After this I need to find the range of these 50 y values, which is y[max]-y[min]. Then , I need to divide the range by 10 to create my frequency table. In the above example, range = 1000 - 0 = 1000. 1000/10 = 100.

so, my frequency table will look like

0 - 100   count value of y between 0 to 100 say i(0-100)
100 - 200 count value

and so on till

900-1000  count value.

I need to get total count value say " total ". After this, I need to take the ratio of respective count value to total count . For the first row , it would be i(0-100)/total. For the second row , it would be i(100-200)/total and so on. Lets call these values as f. so, f1 = i(0-100)/total ; f2 = i(0-100)/total and so on.

After this , I need to compute summation [ f * ln(f) ]. So, our return value is summation [ f * ln(f)] for the window from 1 to 50. After this , I need to slide this window by 1 position, which can be achieved by roll apply() function in R.

You can use this code to start :

library(zoo)
set.seed(1)
foo <- runif(100)
foo[c(58,59)] <- 0
rollapply(foo,width=50,FUN=function(xx){ ....})

I want some help to complete this "function(xx){ ....})" in the above code. I am struggling to put all the information that i said into this one small function.

Updated answer... I hope following will help you... Though its not exactly what you wanted

set.seed(1)
foo <- sample(1:1000,100)

#Rolling function
rollFreq = function(x, binCount){
  temp = hist(x, breaks = binCount, plot = FALSE)
  return = temp$counts
}

rollapply(foo, width=50, FUN=rollFreq, binCount = 10)

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    4    6    2    5    6    5    8    4    5     5
 [2,]    4    6    1    5    6    5    8    4    6     5
 [3,]    4    6    1    5    6    5    8    5    5     5
 [4,]    4    6    1    5    5    5    8    6    5     5
 [5,]    4    6    1    5    4    5    8    6    5     6
 [6,]    4    6    1    5    4    5    7    6    5     7
 [7,]    4    6    1    5    4    6    7    5    5     7
 [8,]    5    5    1    5    4    6    7    5    5     7
 [9,]    5    5    1    6    4    6    7    5    4     7
[10,]    5    5    2    6    4    6    7    5    4     6
[11,]    5    6    2    6    4    5    7    5    4     6
[12,]    6    6    2    6    3    5    7    5    4     6
[13,]    6    5    2    7    3    5    7    5    4     6
[14,]    5    5    2    8    3    5    7    5    4     6
[15,]    5    6    2    8    3    5    7    5    3     6
[16,]    5    5    2    8    3    6    7    5    3     6
[17,]    4    5    2    9    3    6    7    5    3     6
[18,]    4    5    2    9    3    6    7    5    4     5
[19,]    4    5    1    9    3    6    7    5    4     6
[20,]    4    4    1   10    3    6    7    5    4     6
[21,]    4    5    1    9    3    6    7    5    4     6
[22,]    4    6    1    9    2    6    7    5    4     6
[23,]    4    6    1    9    1    6    7    5    4     7
[24,]    4    6    1   10    1    5    7    5    4     7
[25,]    4    6    1   10    1    6    7    5    3     7
[26,]    3    6    1   10    1    6    8    5    3     7
[27,]    3    6    1   10    1    5    8    5    4     7
[28,]    3    6    1   10    1    6    7    5    4     7
[29,]    4    6    1   10    1    6    6    5    4     7
[30,]    4    6    1   10    1    6    5    5    5     7
[31,]    4    6    1   10    0    6    5    6    5     7
[32,]    4    6    1   10    0    6    5    6    5     7
[33,]    5    6    1    9    0    6    5    6    5     7
[34,]    6    6    1    9    0    6    5    6    4     7
[35,]    6    6    1    9    0    6    5    6    4     7
[36,]    6    6    1    9    0    5    5    7    4     7
[37,]    6    6    1    9    0    6    5    7    4     6
[38,]    6    7    1    9    0    6    5    7    4     5
[39,]    5    7    1   10    0    6    5    7    4     5
[40,]    5    7    1   10    0    6    6    6    4     5
[41,]    5    7    1   10    0    6    5    6    4     6
[42,]    5    6    1   10    0    6    6    6    4     6
[43,]    5    6    2   10    0    6    5    6    4     6
[44,]    5    6    2   10    0    6    5    6    4     6
[45,]    5    6    3    9    0    6    5    6    4     6
[46,]    5    6    3    9    0    6    4    6    5     6
[47,]    5    7    3    9    0    6    4    5    5     6
[48,]    5    7    3   10    0    6    3    5    5     6
[49,]    5    7    4    9    0    6    3    5    5     6
[50,]    6    6    4    9    0    6    3    5    5     6
[51,]    6    6    4    8    0    6    3    6    5     6

What I could make out from your description is wrapped in the following function:

ff = function(x) 
{
   rg = range(x)
   f = prop.table(table(cut(x, 
                            do.call(seq, 
                                    c(as.list(rg), 
                                      list(diff(rg) / 10))))))
   sum(f * log(f))                                    
}

Using your "foo" I get:

sapply(head(seq_along(foo), (50 - 1)), 
       function(i) ff(foo[i:(i + (50 - 1))]))
# [1] -2.247295 -2.231095 -2.240361 -2.227678 -2.239769
# [6] -2.244925 -2.239769 -2.223568 -2.246704 -2.251620
#[11] -2.238672 -2.245751 -2.251015 -2.244540 -2.244540
#[16] -2.248029 -2.235686 -2.226600 -2.258055 -2.271002
#[21] -2.278686 -2.255815 -2.251620 -2.251620 -2.231593
#[26] -2.215659 -2.207976 -2.192042 -2.192042 -2.195029
#[31] -2.194966 -2.174102 -2.168838 -2.138807 -2.118781
#[36] -2.127867 -2.127867 -2.130853 -2.130853 -2.143801
#[41] -2.173831 -2.181514 -2.163534 -2.190973 -2.167729
#[46]       NaN       NaN       NaN       NaN

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM