
R solver optimization problem

I have a data frame in R:

   buys ges   dif bin
1 22.34  12 10.34   0
2 55.56  12 43.56   0
3 78.33  12 66.33   0
4  9.99  12  2.01   1
..   ..  ..   ..    ..   

dif is just abs(buys - ges), and bin is an ifelse() result that is 1 if dif <= 10 and 0 otherwise. I'm trying to maximize the sum of the bin column by changing the ges column, with the constraint that ges is the same for all rows. I've tried a couple of packages but can't figure out how to set up the maximization. Thanks for any suggestions.

> a <- rnorm(100)
> buys <- data.frame(a*100)
> buys <- round(abs(buys), 2)
> summary(buys)
    a...100             gs          dif              bin      
 Min.   :  0.89   Min.   :15   Min.   :  1.76   Min.   :0.00  
 1st Qu.: 38.29   1st Qu.:15   1st Qu.: 23.29   1st Qu.:0.00  
 Median : 72.89   Median :15   Median : 57.88   Median :0.00  
 Mean   : 83.91   Mean   :15   Mean   : 70.52   Mean   :0.13  
 3rd Qu.:123.50   3rd Qu.:15   3rd Qu.:108.50   3rd Qu.:0.00  
 Max.   :269.11   Max.   :15   Max.   :254.11   Max.   :1.00  
> gs1 <- 5
> buys$gs <- gs1
> buys$dif <- abs(buys[,1]  - buys$gs)
> buys$bin <- ifelse(buys$dif<=10,1,0)
> colnames(buys) <- c("buys","gs","dif","bin")
> head(buys)
    buys gs    dif bin
1   7.48  5   2.48   1
2  79.08  5  74.08   0
3 139.22  5 134.22   0
4  41.60  5  36.60   0
5  38.35  5  33.35   0
6 157.72  5 152.72   0
> sum(buys$bin)
[1] 10
> num_buys=function(x)
+ {
+   return(length(buys$buys[buys$buys>=x-10 | buys$buys<=x+10]))
+ }
> ans2 <- optimize(f=num_buys,interval=c(min(buys$buys),max(buys$buys)),maximum=TRUE)
> 
> 
> ans2 
$maximum
[1] 269.1099

$objective
[1] 100
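A note on the attempt above: the condition in num_buys joins the two comparisons with | (or) rather than & (and). Every number is either >= x-10 or <= x+10, so the function returns the full row count (100) for any x, and optimize has nothing to distinguish. The sketch below illustrates the difference on simulated data; the seed and the names count_or and count_and are assumptions made for the illustration, not part of the original post.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
buys <- data.frame(buys = round(abs(rnorm(100) * 100), 2))

# Version from the attempt: "|" is TRUE for every element
count_or <- function(x) {
  length(buys$buys[buys$buys >= x - 10 | buys$buys <= x + 10])
}

# Intended version: "&" keeps only values inside [x - 10, x + 10]
count_and <- function(x) {
  length(buys$buys[buys$buys >= x - 10 & buys$buys <= x + 10])
}

count_or(5)      # same as nrow(buys), whatever x is
count_or(1000)   # still nrow(buys)
count_and(5)     # number of values within 10 of 5
```

With & in place of |, the objective actually varies with x, which is what optimize needs.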

Since the values of bin are either 0 or 1, for a given value of ges we're really just counting the number of elements of buys that lie in the interval [ges-10, ges+10]. Visually, one can imagine "sliding" the interval [ges-10, ges+10] from ges=min(buys) to ges=max(buys) and treating the number of entries of buys inside the interval as the value of a function of ges. In particular:

num_buys=function(x)
{
  return(length(buys[buys>=x-10 & buys<=x+10]))
}
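As a quick sanity check, the sketch below applies this function to the four sample rows shown in the question (with ges = 12) and compares it with the sum of the bin column. It assumes buys is a plain numeric vector rather than a data frame column.

```r
# The four sample values from the question
buys <- c(22.34, 55.56, 78.33, 9.99)
ges  <- 12

# Columns as defined in the question
dif <- abs(buys - ges)
bin <- ifelse(dif <= 10, 1, 0)

# Counting function from the answer
num_buys <- function(x) {
  return(length(buys[buys >= x - 10 & buys <= x + 10]))
}

sum(bin)        # objective as stated in the question
num_buys(ges)   # same quantity, computed by counting
```

Both expressions count only the value 9.99, which matches the bin column in the question's table.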

With that, we can use optimize to find a maximum:

optimize(f=num_buys,interval=c(min(buys),max(buys)),maximum=TRUE)

As an example:

> buys=rnorm(10000,mean=50,sd=10)
> summary(buys)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  11.38   43.22   50.01   50.06   56.93   92.76
> num_buys=function(x){return(length(buys[buys<=x+10 & buys>=x-10]))}
> optimize(f=num_buys,interval=c(min(buys),max(buys)),maximum=TRUE)
$maximum
[1] 50.16788

$objective
[1] 6808

So, in this case, the maximum value of sum(bin) found is 6808, and it occurs at ges=50.16788. Of course, this makes perfect sense: the data are normal with mean 50 and standard deviation 10, so about 68% of the values should fall within 10 units (one standard deviation) of 50. :D
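One caveat worth hedging: optimize is intended for continuous functions, while num_buys is a step function of x, so the value it returns is not guaranteed to be the global maximum on every data set. A minimal sketch of an exhaustive alternative follows. It relies on the observation that any window [ges-10, ges+10] can be slid to the right until its left edge touches a data point without losing any points, so only the candidates ges = buys[i] + 10 need to be checked. The seed and the names s, counts, best_count and best_ges are assumptions made for this illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
buys <- rnorm(10000, mean = 50, sd = 10)

s <- sort(buys)
# For each sorted value s[i], count the points in [s[i], s[i] + 20],
# i.e. the window of half-width 10 whose left edge sits on s[i].
# findInterval(v, s) gives the number of elements of s that are <= v.
counts <- findInterval(s + 20, s) - seq_along(s) + 1

best_count <- max(counts)
best_ges   <- s[which.max(counts)] + 10  # centre of the best window

# Compare with the optimize() approach on the same data
num_buys <- function(x) length(buys[buys >= x - 10 & buys <= x + 10])
opt <- optimize(f = num_buys, interval = c(min(buys), max(buys)), maximum = TRUE)
c(exact = best_count, optimize = opt$objective)
```

On well-behaved data like this the two results should be close; the exhaustive search matters more for small or multi-modal samples, where the step function has several separate peaks.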
