I want to generate combinations of 5 names from a column in an R data frame, whose values in a different column add up to a certain number or less
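I have a data frame (UFC) with 4 columns.
Column 1 (UFC$Name) is the names of the UFC fighters who are fighting this weekend.
Column 2 (UFC$Salary) is the amount they "cost" in a fantasy sports game.
Column 3 (UFC$WinPct) is the likelihood that the fighter wins their fight.
Column 4 (UFC$FinishPct) is the likelihood that the fighter wins their fight without it going to a decision.
I want to make a data frame that contains all combinations of 5 fighters from column 1 (or rather, the top X of them, based on the parameters I mention in the next paragraph) whose column 2 values sum to $50,000 or less.
What I'm really interested in, then, are the combinations of 5 fighters whose column 4 values sum to the highest total.
I'm decent at low-level tinkering with data frames, but this is too advanced for me to wrap my head around an approach.
This is about 30% of the data frame.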
Name Salary WinPct FinishPct
Keita Nakamura 9100 31.00 15.36
George Roop 8900 33.00 15.76
Teruto Ishihara 9000 33.00 17.08
Naoyuki Kotani 8700 30.50 18.35
Yusuke Kasuya 8500 29.60 21.16
Katsunori Kikuno 8800 33.66 21.88
The desired output would look like this:
Lineup
Roy Nelson,Gegard Mousasui,Yusuke Kasuya,George Roop,Diego Brandao
SalarySum
47900
FinishPctSum
148.99
It would then return the top X of these outputs, ranked by highest FinishPctSum.
Well, this isn't going to be very fast, but here's an idea...
## make a list of all combinations of 5 of Name, Salary, and FinishPct
xx <- with(df, lapply(list(as.character(Name), Salary, FinishPct), combn, 5))
## convert the names to a string,
## find the column sums of the others,
## set the names
yy <- setNames(
    lapply(xx, function(x) {
        if(typeof(x) == "character") apply(x, 2, toString) else colSums(x)
    }),
    names(df)[c(1, 2, 4)]
)
## coerce to data.frame
newdf <- as.data.frame(yy)
which results in
# Name Salary FinishPct
# 1 Keita Nakamura, George Roop, Teruto Ishihara, Naoyuki Kotani, Yusuke Kasuya 44200 87.71
# 2 Keita Nakamura, George Roop, Teruto Ishihara, Naoyuki Kotani, Katsunori Kikuno 44500 88.43
# 3 Keita Nakamura, George Roop, Teruto Ishihara, Yusuke Kasuya, Katsunori Kikuno 44300 91.24
# 4 Keita Nakamura, George Roop, Naoyuki Kotani, Yusuke Kasuya, Katsunori Kikuno 44000 92.51
# 5 Keita Nakamura, Teruto Ishihara, Naoyuki Kotani, Yusuke Kasuya, Katsunori Kikuno 44100 93.83
# 6 George Roop, Teruto Ishihara, Naoyuki Kotani, Yusuke Kasuya, Katsunori Kikuno 43900 94.23
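No check is performed here to determine whether the salary total is within 50k. This only gives all combinations of 5 fighters and their respective sums. You can subset to find the ones with a salary total of 50k or less with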
newdf[newdf$Salary <= 5e4, ]
Note that 5e4 is shorthand (scientific notation) for 50,000.
Data:
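To get from the subset to the "top X" ranking asked for in the question, one option is to sort the affordable lineups by their FinishPct sum and keep the first few rows. This is only a minimal sketch: the small `newdf` below is a made-up stand-in with the same column names so the snippet runs on its own, and `top_x` is a hypothetical name for the number of lineups to keep.

```r
## Assumption: `newdf` has the columns Name, Salary and FinishPct built above;
## a tiny stand-in with invented lineups is defined here so this runs on its own
newdf <- data.frame(
    Name      = c("lineup A", "lineup B", "lineup C", "lineup D"),
    Salary    = c(44200, 51000, 44300, 43900),
    FinishPct = c(87.71, 99.00, 91.24, 94.23),
    stringsAsFactors = FALSE
)

top_x <- 2  # hypothetical: how many lineups to keep

## keep only lineups within the salary cap
affordable <- newdf[newdf$Salary <= 5e4, ]
## rank by FinishPct sum, highest first, then keep the top X rows
ranked <- affordable[order(affordable$FinishPct, decreasing = TRUE), ]
top_lineups <- head(ranked, top_x)
top_lineups
```

With the real `newdf`, only the last three statements are needed.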
df <- structure(list(Name = structure(c(3L, 1L, 5L, 4L, 6L, 2L), .Label = c("George Roop",
"Katsunori Kikuno", "Keita Nakamura", "Naoyuki Kotani", "Teruto Ishihara",
"Yusuke Kasuya"), class = "factor"), Salary = c(9100L, 8900L,
9000L, 8700L, 8500L, 8800L), WinPct = c(31, 33, 33, 30.5, 29.6,
33.66), FinishPct = c(15.36, 15.76, 17.08, 18.35, 21.16, 21.88
)), .Names = c("Name", "Salary", "WinPct", "FinishPct"), class = "data.frame", row.names = c(NA,
-6L))
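As a variation on the idea above, combn can be called once on the row numbers instead of once per column, with each column of the resulting index matrix used to look up names and sums. This is a sketch under assumptions, not a tuned solution: the output column names Lineup, SalarySum and FinishPctSum are taken from the desired output in the question, and the approach is still exhaustive, so the number of combinations (choose(n, 5)) grows quickly with the number of fighters.

```r
## same six fighters as in the data above, rebuilt so this runs on its own
df <- data.frame(
    Name = c("Keita Nakamura", "George Roop", "Teruto Ishihara",
             "Naoyuki Kotani", "Yusuke Kasuya", "Katsunori Kikuno"),
    Salary = c(9100L, 8900L, 9000L, 8700L, 8500L, 8800L),
    FinishPct = c(15.36, 15.76, 17.08, 18.35, 21.16, 21.88),
    stringsAsFactors = FALSE
)

## each column of idx holds the row numbers of one combination of 5 fighters
idx <- combn(nrow(df), 5)

## build one row per combination by indexing df with each column of idx
lineups <- data.frame(
    Lineup       = apply(idx, 2, function(i) toString(df$Name[i])),
    SalarySum    = apply(idx, 2, function(i) sum(df$Salary[i])),
    FinishPctSum = apply(idx, 2, function(i) sum(df$FinishPct[i])),
    stringsAsFactors = FALSE
)

## apply the 50k cap, then sort by FinishPctSum, highest first
lineups <- lineups[lineups$SalarySum <= 5e4, ]
lineups <- lineups[order(-lineups$FinishPctSum), ]
head(lineups, 3)
```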