Generating combinations of 5 names from one column of an R data frame whose values in another column sum to a given amount or less

Question

I have a data frame (UFC) with 4 columns.

Column 1 (UFC$Name) holds the names of the UFC fighters fighting this weekend.

Column 2 (UFC$Salary) is how much they "cost" in a fantasy sports contest.

Column 3 (UFC$WinPct) is how likely the fighter is to win the fight.

Column 4 (UFC$FinishPct) is how likely the fighter is to win the fight without it going to a decision.

I'd like to make a data frame that contains all the combinations of 5 fighters from column 1 whose column 2 values sum to $50,000 or less (or, more practically, the top X of them, based on the parameter I mention in the next paragraph).

What I'm really interested in, then, is the combinations of 5 fighters whose column 4 sums are highest.

I'm getting pretty good at low-level tinkering with data frames, but this is a little too advanced for me to work out how to approach.

Here is about 30% of the data frame.

              Name Salary WinPct FinishPct
    Keita Nakamura   9100  31.00     15.36
       George Roop   8900  33.00     15.76
   Teruto Ishihara   9000  33.00     17.08
    Naoyuki Kotani   8700  30.50     18.35
     Yusuke Kasuya   8500  29.60     21.16
  Katsunori Kikuno   8800  33.66     21.88

The desired output would look something like this:

Lineup                                                              SalarySum FinishPctSum
Roy Nelson,Gegard Mousasui,Yusuke Kasuya,George Roop,Diego Brandao      47900       148.99

It would return the top X of those outputs, ranked by highest FinishPctSum.

Answer 1

Well, this won't be terribly fast, but it's an idea ...

## make a list of all combinations of 5 of Name, Salary, and FinishPct
xx <- with(df, lapply(list(as.character(Name), Salary, FinishPct), combn, 5))
## convert the names to a string, 
## find the column sums of the others,
## set the names
yy <- setNames(
    lapply(xx, function(x) {
        if(typeof(x) == "character") apply(x, 2, toString) else colSums(x)
    }),
    names(df)[c(1, 2, 4)]
)
## coerce to data.frame
newdf <- as.data.frame(yy)

which results in

#                                                                               Name Salary FinishPct
# 1      Keita Nakamura, George Roop, Teruto Ishihara, Naoyuki Kotani, Yusuke Kasuya  44200     87.71
# 2   Keita Nakamura, George Roop, Teruto Ishihara, Naoyuki Kotani, Katsunori Kikuno  44500     88.43
# 3    Keita Nakamura, George Roop, Teruto Ishihara, Yusuke Kasuya, Katsunori Kikuno  44300     91.24
# 4     Keita Nakamura, George Roop, Naoyuki Kotani, Yusuke Kasuya, Katsunori Kikuno  44000     92.51
# 5 Keita Nakamura, Teruto Ishihara, Naoyuki Kotani, Yusuke Kasuya, Katsunori Kikuno  44100     93.83
# 6    George Roop, Teruto Ishihara, Naoyuki Kotani, Yusuke Kasuya, Katsunori Kikuno  43900     94.23

This does not check whether the salary sums are at or under 50k; it simply gives every combination of 5 fighters with their respective sums. You can subset to the lineups whose salary sum is 50k or less with

newdf[newdf$Salary <= 5e4, ]

Note that 5e4 is scientific notation for 50,000.
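
The question also asks for the top X lineups ranked by their FinishPct sum, which the subset above does not do on its own. A minimal sketch of that last step follows, assuming the six sample fighters; `top_n` and `salary_cap` are illustrative names that are not in the original, and the combinations are built with the `FUN` argument of `combn`, which should give the same rows as the code above.

```r
## Sample fighters from the question (WinPct omitted, it is not needed here)
df <- data.frame(
    Name = c("Keita Nakamura", "George Roop", "Teruto Ishihara",
             "Naoyuki Kotani", "Yusuke Kasuya", "Katsunori Kikuno"),
    Salary = c(9100L, 8900L, 9000L, 8700L, 8500L, 8800L),
    FinishPct = c(15.36, 15.76, 17.08, 18.35, 21.16, 21.88),
    stringsAsFactors = FALSE
)

## One row per 5-fighter combination: joined names and column sums
newdf <- data.frame(
    Name = combn(df$Name, 5, FUN = toString),
    Salary = combn(df$Salary, 5, FUN = sum),
    FinishPct = combn(df$FinishPct, 5, FUN = sum),
    stringsAsFactors = FALSE
)

top_n <- 3         # assumed value for "X"
salary_cap <- 5e4  # cap taken from the question

## Keep lineups at or under the cap, then sort by summed FinishPct, highest first
under_cap <- newdf[newdf$Salary <= salary_cap, ]
ranked <- under_cap[order(under_cap$FinishPct, decreasing = TRUE), ]
top_lineups <- head(ranked, top_n)
print(top_lineups)
```

With only six sample fighters every lineup is under the cap, so the filter only starts to matter once the full fighter list is used.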

Data:

df <- structure(list(Name = structure(c(3L, 1L, 5L, 4L, 6L, 2L), .Label = c("George Roop", 
"Katsunori Kikuno", "Keita Nakamura", "Naoyuki Kotani", "Teruto Ishihara", 
"Yusuke Kasuya"), class = "factor"), Salary = c(9100L, 8900L, 
9000L, 8700L, 8500L, 8800L), WinPct = c(31, 33, 33, 30.5, 29.6, 
33.66), FinishPct = c(15.36, 15.76, 17.08, 18.35, 21.16, 21.88
)), .Names = c("Name", "Salary", "WinPct", "FinishPct"), class = "data.frame", row.names = c(NA, 
-6L))
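
Because the number of lineups grows quickly with the number of fighters, it may be worth checking `choose(nrow(df), 5)` before enumerating anything. The sketch below is one possible variation, not the answer's method: it calls `combn` once on row indices and reuses that index matrix for every column. `best_lineups` and its arguments are hypothetical names introduced here for illustration.

```r
## Hypothetical helper (not from the original answer): enumerate lineups by
## row index, keep those at or under the cap, rank by summed FinishPct.
best_lineups <- function(df, cap = 5e4, top_n = 10, size = 5) {
    idx <- combn(nrow(df), size)  # size x choose(n, size) matrix of row indices
    ## Lay a column's values out in the same shape as idx
    pick <- function(v) matrix(v[idx], nrow = size)
    out <- data.frame(
        Lineup = apply(pick(as.character(df$Name)), 2, toString),
        SalarySum = colSums(pick(df$Salary)),
        FinishPctSum = colSums(pick(df$FinishPct)),
        stringsAsFactors = FALSE
    )
    out <- out[out$SalarySum <= cap, ]
    out <- out[order(out$FinishPctSum, decreasing = TRUE), ]
    head(out, top_n)
}

## Sample fighters from the question
df <- data.frame(
    Name = c("Keita Nakamura", "George Roop", "Teruto Ishihara",
             "Naoyuki Kotani", "Yusuke Kasuya", "Katsunori Kikuno"),
    Salary = c(9100L, 8900L, 9000L, 8700L, 8500L, 8800L),
    FinishPct = c(15.36, 15.76, 17.08, 18.35, 21.16, 21.88),
    stringsAsFactors = FALSE
)

n_lineups <- choose(nrow(df), 5)  # how many candidate lineups exist
## An artificially low cap, only so the filter has a visible effect on 6 rows
res <- best_lineups(df, cap = 44000, top_n = 2)
print(res)
```

The column names match the desired output in the question (Lineup, SalarySum, FinishPctSum); the real cap would be 5e4 rather than the lowered value used for this demonstration.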

Solution 1
1 2015-09-24 02:20:56

我想从R数据框中的一列生成5个名称的组合，其在不同列中的值加起来等于或小于一定数量

问题描述

1 个解决方案

解决方案1 1 2015-09-24 02:20:56

解决方案1
1 2015-09-24 02:20:56